WHY RANDOMIZE?


B. G. GREENBERG* 


Department of Biostatistics
University of North Carolina 
Chapel Hill 


INTRODUCTION 


IN ASSIGNING TREATMENTS to experimental units in biological experiments,
it is not uncommon to observe that the sequence of treatments
constitutes an ordered set. This situation is analogous to, yet
slightly different from, the problem in agricultural experimentation 
where the experimental units (e.g., neighboring plots) are usually highly 
correlated. Both types of experiment emphasize, however, the need 
for a random arrangement of the treatments on the experimental units. 
To cancel the effect of this order, one might be tempted to balance 
the allocation of treatments by resorting to a systematic arrangement 
omitting any element of randomization. In fact, Gosset (6) strongly 
advocated the use of such systematic designs over random ones in the 
case of field plots. 

The systematic scheme, besides being simple to execute, may also 
appeal to the researcher because once in a while it reduces the estimate 
of error. In violating one of the fundamental assumptions of least 
squares, these estimates of error are defective. This may result in 
serious consequences in subsequent tests of significance or when calcu- 
lating fiducial limits of the magnitude of the difference between treat- 
ments. Yates (11) discussed this problem at length and refuted any 
real advantages which Gosset assigned to systematic designs. Yates (9), 
(10) also stressed the importance of randomization in other aspects of 
experimental work. 

As indicated above, the distribution of errors in a systematic ar- 
rangement raises a question concerning the applicability of the ordinary 
methods of analysis. The problem deserves further attention in sta- 
tistical theory. 

The purpose of this paper is to present an argument against syste- 
matic designs in the case when the sequence of treatments is highly 
correlated. Ordered sequences of treatments and correlated
environmental conditions superimposed upon the experimental units
occur frequently in medical and public health research. Some
typical experiences in this field are cited in a later section. 


*The author wishes to acknowledge helpful suggestions from Drs. H. L. Lucas, Jr. and H. C. 
Batson in preliminary discussion of this problem. 
It is hoped to show that instead of relying on systematization to
balance the effect of order, local control (randomized blocks, latin 
squares, and covariance) with randomization, has the effect of removing 
this disturbance and the errors can be treated as if they were uncorre- 
lated. It is not planned to survey all the possible designs that might 
be employed in this set of uniformity data, but simply to accumulate 
evidence on the advantage of randomization with some of the more 
commonly known types of design. 


DESCRIPTION OF THE MATERIAL 


This study and its data are a by-product of an investigation in
experimental designs for parasitological research. Material from
parasitology is particularly appropriate for this discussion for two reasons.
The first is that in this type of research, the mechanism of assigning 
treatments to animals or other experimental units is an influential and 
important part of the experiment. The second is that it is common 
practice among laboratory workers in this field to rely upon a syste- 
matic scheme to balance or equalize the allocation of treatments to 
experimental units. 

For example, in an experiment to measure the development of im- 
munity in mice to a particular nematode (Trichinella), young mice are 
matched according to sex and placed into two groups, one experimental 
and the other control. After a series of stimulating infections among 
the experimental group only, the immunity is measured by challenging 
both groups with a like number of organisms. To challenge each mouse 
with the infecting parasites, a small amount of solution is prepared 
containing the necessary larvae. Thus, the solution might be such that 
0.05 cc. is supposed to contain 200 larvae. Each mouse is then given 
0.05 cc. by mouth from a syringe. Mice from the two groups are in- 
fected alternately, in the hope that this will cancel variations in dosage. 

A systematic scheme of this kind formed by an odd, even, odd, even 
order of injecting the mice during challenge may sometimes lead to a 
serious bias. It will also invalidate the estimate of error. To demonstrate
the extent of this error as well as the limitations of some corrective
measures, the following set of uniformity data were assembled.

Following the previously described procedure, instead of inoculating 
mice with the larvae, thirty dosages were placed on individual slides for 
the purpose of counting the number of larvae. The odd-numbered slides 
were considered experimentals, the even-numbered ones, controls, ex- 
actly as it would have occurred in an experiment. The uniformity data 
for these thirty slides are presented in Table 1.* 


*The author is indebted to Mr. Charles Baughn for these data. 
TABLE 1. UNIFORMITY DATA ON NUMBERS OF LARVAE IN 30 INOCULA ASSIGNED
BY AN ALTERNATION SCHEME.


                                      Difference
Set     Experimental    Control    (Control - Experimental)

 1          200           207            + 7
 2          204           206            + 2
 3          211           208            - 3
 4          211           213            + 2
 5          212           217            + 5
 6          220           221            + 1
 7          225           235            +10
 8          238           238              0
 9          238           237            - 1
10          238           246            + 8
11          243           243              0
12          251           252            + 1
13          259           261            + 2
14          275           277            + 2
15          279           283            + 4

Mean       233.6         236.3           + 2.7


PRELIMINARY ANALYSIS OF UNIFORMITY DATA 


It can be seen that the experimental animals received a smaller 
number of infecting organisms than the controls. The average number 
of larvae inoculated in the experimental animals was 233.6, whereas it 
was 236.3 in the controls, or a difference of 2.7 larvae. 
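
As a quick arithmetic check, the two means and their difference can be recomputed from the counts in Table 1. The sketch below is in Python, which is not part of the original paper, and the variable names are illustrative. One assumption is flagged: the Set 3 pair (211, 208) is inferred from the reported column means and the grand total of 7048 quoted in a later footnote.

```python
# Larval counts per slide from Table 1, one pair per set.
# Assumption: the Set 3 pair (211, 208) is inferred from the reported column
# means and the grand total of 7048; it is not read directly from the table.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]

mean_experimental = sum(experimental) / len(experimental)
mean_control = sum(control) / len(control)
mean_difference = mean_control - mean_experimental

print(f"experimental mean = {mean_experimental:.1f}")
print(f"control mean      = {mean_control:.1f}")
print(f"difference        = {mean_difference:.1f}")
```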

One explanation for this difference lay in the syringe and its method 
of use. The larvae were relatively large in size and clung to the various 
surfaces of the needle and syringe causing incomplete evacuation. The 
syringe was then reinserted into the stock solution containing the in- 
fecting larvae, and after stirring by flushing the syringe several times, 
0.05 cc. was again drawn off. Due to incomplete evacuation, the solution
containing the larvae was gradually being made more virulent in
terms of the relative number of larvae. Parasitologists have recognized 
this build-up in the relative density of the larvae. As a safeguard 
against the effect of this exponential increase, many do not permit the 
larval solution to fall below a prescribed minimum volume. 

A flushing of the syringe in clear saline solution before reinserting it 
into the stock solution revealed that sufficient larvae remained in the 
syringe to raise each subsequent dosage by two to five larvae. Thus, 
each of the thirty inocula, on the average, exceeded the previous
inoculum by this amount.
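
The average build-up per inoculation can be roughly summarized by a least-squares slope of count on order of inoculation. This is a sketch only, with illustrative names: a straight line is an assumption made here for simplicity, and the Set 3 pair (211, 208) is inferred from the reported means and totals.

```python
# Larval counts in order of inoculation (slides 1..30): odd slides are the
# experimental column of Table 1, even slides the control column.
# Assumption: the Set 3 pair (211, 208) is inferred from the reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
counts = [x for pair in zip(experimental, control) for x in pair]
order = list(range(1, len(counts) + 1))

n = len(counts)
mean_order = sum(order) / n
mean_count = sum(counts) / n

# Ordinary least-squares slope: extra larvae per successive inoculation.
s_xy = sum((x - mean_order) * (y - mean_count) for x, y in zip(order, counts))
s_xx = sum((x - mean_order) ** 2 for x in order)
slope = s_xy / s_xx

print(f"average increase per inoculation = {slope:.2f} larvae")
```

The fitted slope falls inside the range of two to five larvae found by direct count.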
The question naturally arises whether this difference would be important,
and if so, whether it is also statistically significant.

From the standpoint of both the statistician and the experimenter, 
one can state this difference might be quite important at times, partic- 
ularly in quantitative immunology. Despite the fact that the average 
difference of 2.7 represents only a 1% variation in the inoculating dose, 
the bias is always in the same direction. Under the circumstances de- 
scribed herein, the latter treatment will regularly receive, on the average, 
the larger of the two inocula. On the other hand, this difference may 
often be negligible in experiments designed to show large treatment 
differences. 

Still more important from the statistician’s viewpoint is that the 
principle of using more animals to reduce the variation of the sample 
means might be vitiated and possibly act in reverse. As in the present 
case, where the estimated value of σ² began to inflate rapidly, the estimated
value of σ²/N may increase instead of decrease. Moreover, additional
animals would further exaggerate the differences between the
pairs. 

The problem of statistical significance of the difference between the 
two groups depends upon the method of analyzing the results. If the 
results from the fifteen animals on each treatment are analyzed as if 
they were completely randomized, then the difference of 2.7 larvae turns 
out to be not significant. This non-significance, on the other hand, is 
probably due to the magnitude of the extraneous variation caused by 
the trend in the experimental technique. That is, the observed mean 
square represents the sum of the normal error term plus the additional 
variation induced by the increasing dosages. 

If one pairs the sample values to make a test of significance, under 
the assumption that all sample values are independently distributed 
with the same variance, Walsh (7) has shown that the power efficiency 
of the new t-test is only 90% at the 1% level of significance. In this
instance, it is definitely advantageous to pair, however, since a high 
positive correlation (0.99) exists between the two treatments with regard 
to the order of inoculation. Thus by pairing, the discrepance of 2.7 is 
significantly different from zero at the 1% level. Though significant, 
one should still interpret the difference in the light of its importance on 
a particular experiment. The bias may be small relative to differences 
of practical importance. 
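
The contrast between the two methods of analysis can be sketched numerically. The Python below computes the t statistic for the same mean difference first as two independent groups and then as pairs, together with the correlation between pair members. It is an illustration with assumed helper names, not the author's computation, and the Set 3 pair (211, 208) is inferred from the reported means. The statistics are left to be compared against tabulated critical values.

```python
import math

# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
n = len(experimental)


def mean(values):
    return sum(values) / len(values)


def sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


mean_diff = mean(control) - mean(experimental)

# (a) Two independent groups: pooled variance on 2n - 2 = 28 d.f.
pooled_var = (sum_sq_dev(experimental) + sum_sq_dev(control)) / (2 * n - 2)
t_unpaired = mean_diff / math.sqrt(2 * pooled_var / n)

# (b) Pairs: variance of the within-pair differences on n - 1 = 14 d.f.
diffs = [c - e for e, c in zip(experimental, control)]
var_diff = sum_sq_dev(diffs) / (n - 1)
t_paired = mean_diff / math.sqrt(var_diff / n)

# Correlation between the two members of each pair.
m_e, m_c = mean(experimental), mean(control)
s_ec = sum((e - m_e) * (c - m_c) for e, c in zip(experimental, control))
r = s_ec / math.sqrt(sum_sq_dev(experimental) * sum_sq_dev(control))

print(f"unpaired t = {t_unpaired:.3f}, paired t = {t_paired:.3f}, r = {r:.2f}")
```

The trend inflates the pooled variance, so the unpaired statistic is small; pairing removes the trend from the error term and the same difference yields a much larger statistic.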

This set of uniformity data demonstrates, nevertheless, the ever- 
present danger of concealing real treatment differences, or finding sta- 
tistically significant differences when none exist. The belief that a 
systematic arrangement such as A B A B would cancel the effect of 
order was in error. It also fails to fulfill the assumption of randomness
implicit in current methods of analysis. By other techniques, such as 
randomization with well known designs, it will be shown how one might 
remove the bias toward any one treatment and still use ordinary methods 
of analysis. 

Covariance analysis on order of inoculation will also be helpful if 
the nature of the time trend is known, or can be estimated. Even if it 
is linear, however, the computational details are such as to preclude its 
widespread use unless no other method is available. Of course, for com- 
pleted experiments where the design has already been faulty, covariance 
may be the only method of salvaging the results. In the present case, 
the adjusted mean difference by covariance became 0.04 and this was 
not significantly different from zero. 
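
A linear covariance adjustment on order of inoculation can be sketched as follows for the unblocked analysis of the alternating scheme. The regression is estimated within treatments and used to adjust both the mean difference and the error mean square. The Python and its names are illustrative, and the Set 3 pair (211, 208) is inferred from the reported means.

```python
# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
order_exp = list(range(1, 30, 2))   # slides 1, 3, ..., 29
order_ctl = list(range(2, 31, 2))   # slides 2, 4, ..., 30


def mean(values):
    return sum(values) / len(values)


def cross_dev(xs, ys):
    """Sum of products of deviations of xs and ys from their own means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))


# Within-treatment (error) sums of squares and products.
s_xx = cross_dev(order_exp, order_exp) + cross_dev(order_ctl, order_ctl)
s_xy = cross_dev(order_exp, experimental) + cross_dev(order_ctl, control)
s_yy = cross_dev(experimental, experimental) + cross_dev(control, control)

b = s_xy / s_xx                       # larvae per step in inoculation order
raw_diff = mean(control) - mean(experimental)
adjusted_diff = raw_diff - b * (mean(order_ctl) - mean(order_exp))

error_df = 2 * len(experimental) - 2 - 1   # one d.f. spent on the regression
adjusted_error_ms = (s_yy - s_xy ** 2 / s_xx) / error_df

print(f"b = {b:.2f}, adjusted difference = {adjusted_diff:.2f}, "
      f"adjusted error mean square = {adjusted_error_ms:.4f} on {error_df} d.f.")
```

Because the controls are always one step later in the order, the adjustment subtracts almost exactly one regression coefficient from the raw difference.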

One might suggest substitution of a different systematic arrange- 
ment, such as A B B A, which is more adapted to the particular cir- 
cumstances present in this experiment. There are other types of syste- 
matic arrangement, such as those suggested by Gosset, but a general 
endorsement of any systematic plan is undesirable at the present time. 
Besides invalidating an estimate of error, the entire class of systematic 
schemes is inherently dangerous unless almost complete knowledge of 
important factors affecting the variation in treatments is known before- 
hand. Other unknown factors may coincide directly with the pattern 
used and cause apparent treatment differences. As Cochran and Cox (2) 
have pointed out so effectively, randomization is a form of insurance 
for the experimenter. 

It is not implied that randomization is a panacea to overcome poor 
laboratory technique. Although it may be an indispensable tool, its 
usefulness is limited. In the present instance, the size of the infecting 
dose was becoming more virulent at an accelerated rate. The reason 
for this acceleration was that the reservoir of infecting material was 
diminishing in absolute quantity. This resulted in a more pronounced 
effect when a relatively fixed amount of larvae was re-introduced into 
the reservoir. 

Besides randomization in the present experiment, the use of a 
different syringe for each inoculum is recommended. With any sizeable 
number of animals, however, this imposes practical difficulties upon 
both facilities and the overall time required for inoculation. A satis- 
factory compromise is to rinse the syringe in clear water or saline 
solution between each inoculation. Although a few larvae would still 
remain in the syringe, this number would be trivial as compared to the 
former method. 
RANDOMIZATION SCHEMES


Before looking at some randomized designs, it is worthwhile to look 
in more detail at the results of the two previously discussed systematic 
arrangements, with and without linear covariance adjustment. These 
results are presented in Table 2. 


TABLE 2. ANALYSIS OF UNIFORMITY DATA WITH SYSTEMATIC ALLOCATION


                            Without covariance              With covariance
Method of    Method of    Mean        Estimate            Mean        Estimate
allocation   analysis     treatment   of          d.f.    treatment   of          d.f.
                          difference  variance            difference  variance

Odd, even    Unblocked    2.67        613.8048    28      -0.04       29.3359     27
             Blocks       2.67**        6.2619    14      -†          -†

Odd, even,   Unblocked    0.53        615.6333    28       0.35       29.3017     27
even, odd    Blocks       0.53          9.9190    14       0.35        6.6956     13

**Significant at the 1% level.


†Covariance adjustment of the error term and treatment means can not be applied here. The
removal of block effects causes both the sum of products and the sum of squares of the independent
variate (viz., order of inoculation) terms to be zero.


It can be seen in Table 2 that covariance adjustment is effective 
in removing the bias and in reducing the error term. When the data 
are analyzed as a blocked or paired arrangement, the error term is 
restored to its probable original magnitude, with no effect on the mean 
difference. 

In the case of odd, even allocation, covariance is not applicable 
when the data are treated as in a blocked design. The removal of block 
effects in that case reduces the error term but automatically excludes the 
use of covariance to adjust the difference between the treatment means. 

In the second systematic arrangement, removal of block effects is 
again seen to be the most helpful in reducing the error term. Covariance
makes a further reduction possible and also helps to minimize
any bias in the method of allocation.

As indicated previously, this is not a complete survey of all possible
systematic designs. These two, however, are the ones most commonly 
employed, easily executed, and demonstrate the effects of covariance 
and blocking. 
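
The entries of Table 2 that do not involve covariance can be recomputed directly from the uniformity data. The Python sketch below is an illustration with assumed function and variable names: it analyzes any allocation pattern both unblocked and as blocks of consecutive pairs. The Set 3 pair (211, 208) is inferred from the reported means.

```python
# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
counts = [x for pair in zip(experimental, control) for x in pair]  # slides 1..30


def analyse(pattern):
    """Analyse an allocation given as 30 letters, 'E' or 'C', one per slide.

    Returns (mean difference C - E, unblocked error mean square,
    blocked error mean square). Each consecutive pair of slides is taken as
    a block and must hold one 'E' and one 'C'.
    """
    exp = [y for y, g in zip(counts, pattern) if g == "E"]
    ctl = [y for y, g in zip(counts, pattern) if g == "C"]
    n = len(exp)
    mean_diff = sum(ctl) / n - sum(exp) / n

    grand_mean = sum(counts) / len(counts)
    total_ss = sum((y - grand_mean) ** 2 for y in counts)
    treatment_ss = n * mean_diff ** 2 / 2
    unblocked_ms = (total_ss - treatment_ss) / (2 * n - 2)

    # Within-block differences, always taken as control minus experimental.
    diffs = []
    for k in range(0, len(counts), 2):
        first, second = counts[k], counts[k + 1]
        diffs.append(second - first if pattern[k] == "E" else first - second)
    d_bar = sum(diffs) / n
    blocked_ms = sum((d - d_bar) ** 2 for d in diffs) / (n - 1) / 2
    return mean_diff, unblocked_ms, blocked_ms


odd_even = "EC" * 15                 # A B A B ...
sandwich = ("ECCE" * 8)[:30]         # A B B A ...

result_odd_even = analyse(odd_even)
result_sandwich = analyse(sandwich)
for name, res in (("odd, even", result_odd_even),
                  ("odd, even, even, odd", result_sandwich)):
    print(f"{name}: difference {res[0]:.2f}, "
          f"unblocked {res[1]:.4f}, blocks {res[2]:.4f}")
```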

There are three kinds of randomized designs that will be illustrated 
in the present example. The first is a completely randomized (unblocked)
design. The one advantage gained by this type of randomization
is that the bias toward a particular treatment is removed: either
experimental or control animals might receive the larger set of dosages. 
It will be seen, however, that this accuracy is obtained with no gain 
in precision. 

There are 30!/15! 15! = 155,117,520 possible permutations or ran- 
domization schemes by this method of allocation. It being impractical 
to study the entire group of possible permutations, a sample of 200 
randomization arrangements was chosen. For each individual randomi- 
zation arrangement, a consecutive set of thirty random numbers was 
chosen from tables found in Fisher and Yates (3) so as to be classifiable 
into an equal number of odd and even integers. All the odd numbers 
were considered, say controls, and the even integers were the experi- 
mentals. By some simplifying calculations,* it was noted that 9.5% of 
the 200 mean differences were significantly different at the 5% level, 
and 3.5% were significant at the 1% level. Both of these percentages 
differed significantly from their respective expected percentages. The 
mean differences, however, appeared to favor neither experimental nor 
control. 
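
A sampling study of this kind can be imitated with a pseudo-random generator in place of printed random number tables. The sketch below draws 200 completely randomized allocations and reports the fraction whose treatment totals differ by at least 260 and at least 334, the least significant differences of totals quoted in the footnote. The seed, function names and use of Python's generator are assumptions of this illustration, so the fractions will not reproduce the author's sample exactly. The Set 3 pair (211, 208) is inferred from the reported means.

```python
import math
import random

# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
counts = [x for pair in zip(experimental, control) for x in pair]
grand_total = sum(counts)

# Number of ways to choose which 15 of the 30 slides are experimental.
n_arrangements = math.comb(30, 15)

# Least significant absolute differences of totals, as quoted in the footnote.
LSD_TOTALS_5PCT = 260
LSD_TOTALS_1PCT = 334


def simulate(n_draws, seed):
    """Fractions of random allocations significant at the 5% and 1% levels."""
    rng = random.Random(seed)
    hits_5 = hits_1 = 0
    for _ in range(n_draws):
        exp_slides = rng.sample(range(30), 15)
        exp_total = sum(counts[i] for i in exp_slides)
        diff_totals = abs(grand_total - 2 * exp_total)
        if diff_totals >= LSD_TOTALS_5PCT:
            hits_5 += 1
        if diff_totals >= LSD_TOTALS_1PCT:
            hits_1 += 1
    return hits_5 / n_draws, hits_1 / n_draws


frac_5pct, frac_1pct = simulate(200, seed=1951)  # seed chosen arbitrarily
print(f"{n_arrangements:,} possible arrangements")
print(f"significant at 5%: {frac_5pct:.1%}; at 1%: {frac_1pct:.1%}")
```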

In Table 3 are presented the ranges of the mean differences and the
mean squares, with and without covariance, for the complete 30!/15! 15! 
possible permutations of this design. The mean squares in the analysis 
of variance of the completely randomized designs did not show an 
appreciable reduction in size from the systematic schemes. They too
required the adjustment by covariance to eliminate the induced variation
caused by the increased dosages.

The lack of precision and excessive number of significant differences 
in the completely randomized design are due to the lack of independence 
of the thirty inocula.† The estimated value of the serial correlation


*To avoid calculating a test of significance of the mean difference for each of the 200 schemes, one
can determine instead the least significant difference in treatment totals at the α% level. Corresponding
to each absolute difference, there is only one standard error of the mean difference.

Let $X_{1i}$ represent the experimental values, $i = 1, 2, \ldots, 15$, and $X_{2i}$ the control values. By
substituting $t = 2.048$ for the 5%, and $t = 2.763$ for the 1% levels of significance,

$$
t = \frac{\left(7048 - 2\sum X_{1i}\right)/15}
         {\sqrt{\dfrac{2\left[1{,}673{,}060 - \dfrac{(7048)^2}{30} - \dfrac{\left(7048 - 2\sum X_{1i}\right)^2}{30}\right]}{28(15)}}}
$$

can be solved for $\sum X_{1i}$ (and $\sum X_{2i} = 7048 - \sum X_{1i}$). Thus, it was found that 260 is the least
significant absolute difference of the totals at 5%, and 334 is the value at 1%.
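
The least significant difference of totals described in this footnote can be found by a direct search rather than by solving the equation. The sketch below computes the total sum of squares from the data, then steps through possible differences of totals until the t statistic first reaches the critical value. Function names are illustrative, and the Set 3 pair (211, 208) is inferred from the reported means.

```python
import math

# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]
counts = experimental + control

N = len(counts)          # 30 slides
n = N // 2               # 15 per treatment
grand_total = sum(counts)
total_ss = sum(y * y for y in counts) - grand_total ** 2 / N


def t_for_total_difference(d):
    """t statistic (28 d.f.) when the two treatment totals differ by d."""
    error_ms = (total_ss - d * d / N) / (N - 2)
    return (d / n) / math.sqrt(2 * error_ms / n)


def least_significant_total_difference(t_critical):
    """Smallest attainable-parity difference of totals with t >= t_critical."""
    # The difference equals grand_total - 2 * (one total), so it shares the
    # parity of the grand total and is stepped in twos.
    d = grand_total % 2
    while t_for_total_difference(d) < t_critical:
        d += 2
    return d


lsd_5pct = least_significant_total_difference(2.048)  # t values from the text
lsd_1pct = least_significant_total_difference(2.763)
print(lsd_5pct, lsd_1pct)
```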

†A referee suggested that the discrepance arises “because the errors, although randomized to
treatments, are from an approximately rectangular finite population. The errors are evenly distributed
over the whole range, with a small subsidiary variance as contrasted with exact values assumed for
observations in an ordinary finite population, but not independently distributed with variance equal
to that of the whole population as in random sampling from an infinite population.”




316 BIOMETRICS, DECEMBER 1951 


TABLE 3. POSSIBLE RANGES OF MEAN DIFFERENCES AND MEAN SQUARES FOR
THREE RANDOMIZED DESIGNS, WITH AND WITHOUT COVARIANCE


                                     Without covariance               With covariance
                                          (range)                         (range)
                 Number of          Mean                            Mean
Design           possibilities      differences  Mean square  d.f.  differences  Mean square  d.f.

Completely       30!/(15! 15!)      ±39.60**     195.6667           ±8           9.3462
randomized                          to 0         to 615.7095   28   to 0         to 29.3363    27

Randomized       2^15               ±3.20*       4.5857             ±2.70*       4.8106
blocks (pairs)                      to 0         to 10.0714    14   to 0         to 6.7436     13

Cross-over       2(15!/7!8!)        2.50         2.8929
                                    to 0         to 6.7424     13   -†           -†


*Significant at the 5% level.
**Significant at the 1% level.
†Covariance adjustment of the error term and treatment means can not be applied here. The
removal of order of inoculation effects causes both the sum of products and the sum of squares of the
independent variate (viz., order of inoculation) terms to be zero.


between each dose and its predecessor was r = 0.99. This correlation, 
based on 29 observations, was equivalent to the estimated correlation 
between pairs which was based on 15 observations. 

To sum up the characteristics for the completely randomized design
in this instance, it removes the bias, has too many significant mean 
differences, and retains a considerable amount of imprecision. 

The second type of randomized design is one which is based upon 
pairs, or randomized blocks of two. To allocate the treatments in this 
case, one can toss a coin for each pair to determine which group should 
receive the first of the two inoculations. There are 2^15 = 32,768 possible
combinations of this design. Only 200 of them, however, have mean 
differences significant at the 1% level. This is slightly below the ex- 
pected number of 328. One reason for this is the discrete nature of the 
mean differences. That is, in order for a design to have a mean difference 
significant at the 1% level, the difference between experimentals and 
controls must sum to at least 40.49, whereas 42 is the nearest possible 
total difference above that number. 
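
Since there are only 2^15 coin-toss outcomes, the whole randomization set for the paired design can be enumerated. The sketch below derives the smallest total difference significant at the 1% level and counts the arrangements that reach it in each direction. The critical value 2.977 (two-sided 1% point of t on 14 d.f.) is taken from standard tables and is an assumption of this illustration, as are the names used. The Set 3 pair (211, 208) is inferred from the reported means.

```python
import math
from itertools import product

# Table 1 counts. Assumption: Set 3 (211, 208) is inferred from reported means.
experimental = [200, 204, 211, 211, 212, 220, 225, 238, 238, 238,
                243, 251, 259, 275, 279]
control = [207, 206, 208, 213, 217, 221, 235, 238, 237, 246,
           243, 252, 261, 277, 283]

# Second slide minus first slide within each pair.
pair_diffs = [second - first for first, second in zip(experimental, control)]
n = len(pair_diffs)
sum_sq = sum(d * d for d in pair_diffs)   # unchanged by any sign flips

# Assumption: two-sided 1% point of Student's t on 14 d.f., from standard tables.
T_CRIT_1PCT_14DF = 2.977

# For a total difference D, t^2 = (n - 1) D^2 / (n * sum_sq - D^2);
# solving for D gives the smallest significant total.
t2 = T_CRIT_1PCT_14DF ** 2
threshold = math.sqrt(t2 * n * sum_sq / (n - 1 + t2))

# Each coin toss decides which member of the pair is the control (+1 or -1).
favor_control = favor_experimental = 0
for signs in product((1, -1), repeat=n):
    total = sum(s * d for s, d in zip(signs, pair_diffs))
    if total >= threshold:
        favor_control += 1
    elif total <= -threshold:
        favor_experimental += 1

n_designs = 2 ** n
print(f"threshold total difference = {threshold:.2f}")
print(f"{favor_control + favor_experimental} of {n_designs} arrangements "
      f"significant ({favor_control} and {favor_experimental} in each direction)")
```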

The 200 significant mean differences are evenly divided so that 
neither treatment is favored. Table 3 presents the ranges of mean 
differences and mean squares for the 2^15 combinations of this design, with
and without covariance. It can be seen at a glance that the mean squares, 
on the average, will be lower here than by any of the preceding methods. 
In addition to unbiassedness and high precision of this last design,
there are two other advantages to recommend blocking as a general rule 
in this type of problem. First, it can be easily extended to blocks of
three, four, etc., in the event that there are more than two treatments.
This may be a disadvantage for many types of systematic design. The 
second reason is that in addition to a sequence of treatments varying 
according to some non-random pattern, the effect of other varying en- 
vironmental factors may be superimposed upon the blocks. For ex- 
ample, the arrangement of animals in a laboratory or animal room may 
cause additional fluctuations not previously taken into account by the 
experimenter. If the cages are placed in rows, alternating the experi- 
mental and control animals, the effect is analogous to the alternation 
method of the allocation of treatments. If the temperature in the room 
varies from floor to ceiling, as it usually does, then the bottom row of 
each pair consistently experiences a relatively lower temperature. 
When the animals are in pairs, the blocks may be fitted easily into a 
random arrangement in the room. 

With everything good, there is usually some bad. In the randomized 
blocks design, one sacrifices degrees of freedom quite heavily. It would 
be an illogical gain in precision to reduce the degrees of freedom for 
error to less than, say 10. In such cases, one should resort to covariance* 
to overcome time trends and to reduce the error mean square. 

It can be seen in Table 3 that the lowest possible mean square for the 
randomized blocks design is greater after covariance adjustment than it 
was before. This illustrates another type of situation where the adjust- 
ment by covariance was inadvisable since the regression was not sig- 
nificant. The loss of one degree of freedom caused the mean square to 
increase in this case. 

Having reached the point where blocking seems desirable, another 
step may be taken to restrict the randomization process still further. 
From observation, it was clear that the first member of each pair tended 
to be smaller in number than the other one. It would seem desirable, 
therefore, that some action be taken to guarantee that both experi- 
mental and control animals receive equal preference with regard to this 
difference. That is, instead of leaving it to chance that the random 
arrangement will favor neither treatment, take steps to ensure that each 
receives one half of the “good” as well as the “bad”. This design, 
popular in dairy husbandry and biological assay, is the cross-over type. 


*An interesting point in the covariance adjustments for all designs was that the regression coeffi- 
cients ranged between 2.60-2.70 larvae for each increase in inoculation order. This confirmed the 
original observation by actual count that there were between two and five more larvae in each successive 
inoculum.




In general, of the 2^(2m) possible arrangements for the randomized
blocks design, where m = one-half of the number of blocks or replicates,
there are (2m)!/(m!)^2 possible cross-over designs available. The analysis
of variance takes the following form: 


                          d.f.
Blocks                   2m - 1
Order of inoculation        1
Treatments                  1
Error                    2m - 2
Total                    4m - 1


To obtain one of the possible cross-over designs at random, one 
selects m random numbers from the first 2m integers. These m numbers 
may represent the block or replicate number in which the experimentals 
are the first member of the pair, and the remaining m numbers are those 
blocks in which the experimental animals are the second member. 
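
The selection procedure just described can be sketched in a few lines of Python. The function names, the seed and the labels "EC" and "CE" (experimental first, control first) are illustrative assumptions; the sketch covers only the case of an even number of blocks, 2m.

```python
import math
import random


def random_crossover(n_blocks, rng):
    """Draw one cross-over design at random for an even number of blocks.

    Returns a list with one entry per block: "EC" if the experimental animal
    is the first member of the pair, "CE" if it is the second.
    """
    if n_blocks % 2 != 0:
        raise ValueError("this sketch assumes an even number of blocks (2m)")
    m = n_blocks // 2
    # Select m block numbers at random from the first 2m integers.
    experimental_first = set(rng.sample(range(1, n_blocks + 1), m))
    return ["EC" if block in experimental_first else "CE"
            for block in range(1, n_blocks + 1)]


def count_crossover_designs(m):
    """Number of cross-over designs for 2m blocks: (2m)! / (m!)^2."""
    return math.comb(2 * m, m)


rng = random.Random(7)          # arbitrary seed, for a repeatable illustration
design = random_crossover(14, rng)
print(design)
print(count_crossover_designs(7), "possible designs for 14 blocks")
```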

In our illustration, the number of blocks, or replicates is 15, which 
is odd-numbered and not divisible into two identical portions. This 
means that one group of animals must receive 7 inoculations as the 
first member of the pair, and the other group of animals 8 inoculations 
as the first member of the pair. The number of such designs will be 
2(15!/7!8!). It is usually stated that the number of blocks or replicates
in the cross-over design must be a multiple of the number of treatments. 
For example, in the present illustration, one would have required 14 or 
16 replicates. This so-called restriction is not necessarily true, although 
it is certainly a desideratum from the computational standpoint. If the 
number of replicates is odd, as in this example, the analysis must follow 
the lines of disproportionate sub-class numbers by the method of fitting 
constants. This is equivalent to considering that there are two missing 
values on the sixteenth block. 

According to this method, the analysis of variance is:


                                                    d.f.
Blocks                                               14
Order of inoculation (unadjusted)                     1
Treatments (adjusted for order)                       1
Treatments (unadjusted)                               1
Order of inoculation (adjusted for treatments)        1
Error                                                13
Total                                                29
One can also obtain adjusted estimates of the mean differences between
treatments from the regression coefficients. The ranges of the
results of the 12,870 possible arrangements of the cross-over design are 
thus presented in Table 3. 

It will be noted that the use of covariance adjustment is precluded 
in this design. By removing the effects due to order, the sums of squares 
for the independent variate and the sum of products terms automatically 
become zero. This is not a serious shortcoming from the standpoint of 
precision since it can be seen that the range of mean squares is already 
at a low level. It might be somewhat of a handicap in debarring the
adjustment of the mean treatment differences. One hopes, however, 
that treatment differences will be negligible in this design since each 
group of animals receives the same degree of preferential handling. It 
can be shown that there are 86 possible arrangements which will have 
significant differences between the two groups of animals at the 1% 
level. This is slightly less than the expected number of 129, and is 
probably explained by the discrete nature of the data. 

The cross-over design, therefore, has the most precision, is unbiassed, 
but may present computational difficulties. Another limitation in its 
use is that the design assumes that the difference between paired mem- 
bers within a replicate will remain constant from replicate to replicate. 
This was seen not to be the case with these data when the number of 
replicates is large. There are two feasible alternatives in that case. 

If the degrees of freedom for error are large, a more useful design 
can be obtained by dividing the series of replicates into 2 × 2 latin squares.
In such an arrangement, one removes m degrees of freedom from error 
in place of only 1 for order of inoculation. 

The recommendation of m 2 × 2 latin squares in this example recalls
the interesting series of papers published in England about fifteen years 
ago on the subject of randomization versus systematization. Using the 
uniformity data discussed by Gosset (5) to advocate systematic designs, 
Barbacki and Fisher (1) arranged four successive observations in a 
block or sandwich, such as ABBA or BAAB. 

One can then compare the mean difference of (A — B — B + A) 
with its variance estimated from the m sandwiches. They concluded 
that a random choice for each sandwich was more precise than a syste- 
matic arrangement. 

Gosset (6) replied to this criticism by pointing out a slightly modified 
systematic arrangement of sandwiches. It consisted of m 2 × 2 latin
squares with certain restrictions so that randomization could not be 
applied. His arrangement, in fact, compared rather favorably with that 
of Barbacki and Fisher. What Gosset failed to do, however, was to 
compare this one fixed set of latin squares with all possible random
arrangements. Yates (11) did some investigation along these lines and
concluded that random 2 × 2 latin squares were more precise than
Gosset’s balanced arrangement. In all fairness to Gosset, his death 
prevented a rebuttal to these arguments. 

The results of Yates appear as acceptable in this example as for the 
data used by Gosset. Although all possible systematic designs are not
considered in this present paper, this only serves to show the arbi- 
trariness which must be involved in the selection of a systematic ar- 
rangement. After the data have been compiled, it is easy to discover 
what particular arrangement will have the smallest estimate of error. 
The experimenter, however, must have this information in advance, and 
it must be capable of satisfactory repeatability.

A second possibility which may be more desirable is to revert to the 
randomized blocks design and use covariance adjustment on order of 
inoculation. The covariance should be of an order higher than linear 
if the difference within replicates is undergoing change from replicate 
to replicate. 


APPLICATION TO PUBLIC HEALTH AND MEDICAL RESEARCH 


In public health problems a common method of selecting patients in 
a hospital or clinic for research investigations is as follows: The first 
patient to enter is given the treatment under consideration, but it is 
withheld from the second (or a placebo given), administered to the third 
patient but not to the fourth, and so on. This is the odd, even approach
in another form. 

The result of this kind of treatment allocation to subjects is especially
defective when it is combined with the secular variations in the
infectivity and/or virulence of a bacterium or virus, such as are be- 
lieved to occur during epidemic waves. This variation in microbial 
virulence is strongly supported by Greenwood, Hill, Topley and Wilson 
(4), although a dissenting viewpoint is expressed by Webster (8). 

Without discussing experimental epidemiology at this point, let us 
assume that an epidemic is precipitated by either a change in virulence 
and infectivity, or a reduction in host resistance. Furthermore, assume that
the nature of either of these changes resembles the shape of the epidemic 
curve itself, such as that indicated in Figure 1. 

[Figure 1: number of cases or deaths during an epidemic (vertical axis) plotted against time (horizontal axis).]

FIGURE 1. DIAGRAM ILLUSTRATING VARIATION IN THE OCCURRENCE OF CASES
OR DEATHS DURING AN EPIDEMIC.


If only the first chronological half of the total number of the cases is
studied in conjunction with a scheme alternating first experimentals and
then controls, the controls consistently have infecting organisms with
higher virulence (or lower host resistance). On the other hand, if the
second half is selected, the reverse situation holds true. Continuing this
reasoning, if the half selected is from the middle section, or if all cases
are included, then the two groups are exactly alike. These three methods
of selecting which cases should be studied give entirely different results
when randomization is omitted.
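
The argument can be illustrated with a purely hypothetical epidemic. The sketch below assumes a symmetric, bell-shaped virulence curve over 40 consecutive admissions (the curve, its width and the number of cases are invented for illustration) and compares the average virulence met by controls and by experimentals under strict alternation, for each way of selecting the cases.

```python
import math

N_CASES = 40  # hypothetical number of cases, in order of admission


def virulence(t):
    """Hypothetical symmetric, bell-shaped virulence at admission order t."""
    return math.exp(-((t - (N_CASES + 1) / 2) / 8.0) ** 2)


def control_minus_experimental(cases):
    """Mean virulence met by controls minus that met by experimentals.

    Odd-numbered admissions receive the treatment; even-numbered admissions
    are the controls (the odd, even scheme).
    """
    exp = [virulence(t) for t in cases if t % 2 == 1]
    ctl = [virulence(t) for t in cases if t % 2 == 0]
    return sum(ctl) / len(ctl) - sum(exp) / len(exp)


gap_first_half = control_minus_experimental(range(1, 21))    # rising limb
gap_second_half = control_minus_experimental(range(21, 41))  # falling limb
gap_middle_half = control_minus_experimental(range(11, 31))  # around the peak
gap_all_cases = control_minus_experimental(range(1, 41))

print(f"first half : {gap_first_half:+.4f}")
print(f"second half: {gap_second_half:+.4f}")
print(f"middle half: {gap_middle_half:+.4f}")
print(f"all cases  : {gap_all_cases:+.4f}")
```

On the rising limb every control follows an experimental of lower virulence, on the falling limb the reverse holds, and over any stretch symmetric about the peak the two effects cancel.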


In addition to this type of research problem, systematic arrange- 
ments should be avoided when any of the factors below are applicable. 


a) The quality of the treatment changes with time according to a non- 
random pattern because of the personnel and the learning pattern 
involved. For example, the skill of technicians usually increases with 
the progress of the experiment. 

b) The quality of a drug may be improved as better methods of refinement 
are developed. 

c) An infecting organism may develop resistance to a group of drugs, a 
group of chemicals, or to methods of therapy. 

d) The gradual deterioration of equipment and measuring devices. This 
may apply to various mechanisms of irradiation as well as measuring 
instruments such as a colorimeter or spectrophotometer. It is particu- 
larly important and the effect obvious when an irradiating material 
deteriorates in effective radiological activity with the passage of time or 
usage. 

e) Other miscellaneous environmental factors, such as temperature, diet,
humidity, etc. These are too numerous for the possible effect of each to
be mentioned individually.


This listing appears so all-inclusive that it is reasonable to conclude
that systematic designs should always be by-passed in favor of a random
arrangement.
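One simple way to introduce the random element is to randomize the order of
the two treatments within each successive pair of subjects. The sketch below
assumes two treatments, labelled E (experimental) and C (control), and an
even number of subjects; the function name and the seed are illustrative
only, and this is one of several possible random arrangements.

```python
import random

def randomized_pairs(n_subjects, seed=None):
    """Allocate E and C at random within each successive pair of subjects."""
    if n_subjects % 2 != 0:
        raise ValueError("n_subjects must be even")
    rng = random.Random(seed)
    allocation = []
    for _ in range(n_subjects // 2):
        pair = ["E", "C"]
        rng.shuffle(pair)  # chance decides which member of the pair is treated
        allocation.extend(pair)
    return allocation

allocation = randomized_pairs(12, seed=1951)
print(" ".join(allocation))
```

Each pair still contains one experimental and one control, so the balance of
the systematic scheme is retained, but the order within a pair no longer
follows a fixed pattern.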
SUMMARY


A systematic allocation of treatments to experimental units was
investigated in the field of parasitology and was found to be biased and to
provide too large an estimate of error. Corrective measures of covariance,
randomization, and blocking are discussed and some limitations of each
pointed out. When the sequence of treatments follows a non-random
pattern, it is recommended that the cross-over design be employed
with a random element involved. In cases where the difference between
pairs changes from replicate to replicate, either 2 × 2 Latin squares or
the randomized blocks design with covariance adjustment may be
employed. Applications of the results to medical and public health
problems are pointed out.
STATISTICAL ASPECTS OF THE SIMULTANEOUS DETECTION
OF THYROID AND THYROTROPHIC HORMONES IN
HUMAN SERA, BASED ON THE DATA OF
D’ANGELO AND GORDON


C. H. STEINMETZ 


Department of Zoology 
Indiana University¹


THE DATA presented by D’Angelo and Gordon (1950) and D’Angelo,
Gordon and Charipper (1942) indicate that the stasis tadpole is a 
suitable and sensitive test animal for the bioassay of thyrotrophic hor- 
mone. It was also reported by D’Angelo and Gordon (1950) that it is 
possible to determine in a relatively quantitative manner, both thyroid 
and thyrotrophic hormone levels simultaneously in the same sample of 
serum by an analysis of the effects of the injected sample on develop- 
mental advance and thyroid follicular cell height. Such a method 
should be extremely valuable in advancing our knowledge of the thyroid- 
anterior pituitary gland interactions. It was felt, however, because 
of the interactions between these glands, that a further analysis of the 
original data of D’Angelo and Gordon was necessary for complete 
understanding of the method. It is fortunately possible to examine the 
reported data by means of a factorial analysis which has previously been 
applied to problems of endocrinology by Snedecor and Breneman (1945) 
and Breneman (1951). A factorial analysis is a convenient statistical 
method for evaluating the effects of a given group of treatments, both 
individually and with all possible interactions. The fact that this 
method permits evaluation of interactions between the treatments is 
particularly useful in this case inasmuch as it is necessary to evaluate 
the possible interactions of two hormones known to have physiological 
interactions if they are to be quantitatively determined. 


¹Contribution No. 475 from the Department of Zoology, Indiana University.
DATA OF D’ANGELO AND GORDON (1950) FOR SIMULTANEOUS DETECTION OF TSH
AND THYROXINE COMPARED WITH THE RESULTS OF A FACTORIAL ANALYSIS OF
THE SAME DATA


                                   Cell height¹ᵃ          Hindlimb increase¹ᵇ
                                Original     Total       Original     Total      Number
     Treatment                  data         effect²     data         effect²    animals

     Control                    4.8 ± 0.2³               0.5 ± 0.2⁴                10
a    Thyroxine (10 gamma%)      5.4 ± 0.2     1.3        3.5 ± 0.1     6.6**       10
b    TSH (J.S. unit)            7.8 ± 0.2     9.9**      5.7 ± 0.4    13.0**       10
ab   Thyroxine and TSH          8.1 ± 0.4    −0.7        7.6 ± 0.3    −3.8**       10
c    Serum (normal human)       5.8 ± 0.1     2.1**      5.2 ± 0.3                 20
ac   Thyroxine and serum        6.2 ± 0.4    −0.5        7.4 ± 0.4    −3.2**       20
bc   TSH and serum              8.1 ± 0.3    −1.5*       8.4 ± 0.5    −5.6**       20
abc  Thyroxine, TSH and serum   8.1 ± 0.1    −0.1        7.9 ± 0.4    −1.6         20


*indicates 5% level, and ** indicates 1% level of significance as tested by the “t” test.

¹ᵃfigures represent mean in micra ± standard error of the mean.

¹ᵇfigures represent mean in mm. ± standard error of the mean.

²these columns give the total effect of the treatment, e.g. total effect of thyroxine (10 gamma%)
A = (a − 1) + (ab − b) + (ac − c) + (abc − bc), and in the case of a combination of treatments
the number represents the amount of interaction, i.e. if there were no interaction between TSH and
thyroxine (hindlimb increase) the total effect, AB, would not differ significantly from zero.

³the unweighted means were used in the calculations for the analysis of variance and the fiducial
limits since it was only reported that 50 to 150 cells were measured in glands of 6 to 9 animals.

⁴it is clear that the variance increases with increasing dosage, but the assumption of homogeneity
apparently affects the conclusions less than the other approximations necessary.
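The total effects in the table can be reproduced from the eight treatment
means. The sketch below is an illustration and not the author's own
computation. It assumes that the "1" in the footnote formula denotes the
control mean, that the means are as printed to one decimal place, and that
the footnote formula extends to the other effects by the usual sign rule for
a 2 × 2 × 2 factorial. The names `total_effect`, `cell_height` and `hindlimb`
are introduced here for illustration.

```python
from itertools import product

FACTORS = "abc"  # a = thyroxine, b = TSH, c = serum

def label(levels):
    """Treatment label such as 'ab'; the control (no factor present) is '1'."""
    return "".join(f for f, on in zip(FACTORS, levels) if on) or "1"

def total_effect(means, effect):
    """Signed sum of the eight means for a main effect or an interaction.

    A mean enters with sign (-1)**k, where k is the number of factors named
    in `effect` that are absent from that treatment.
    """
    total = 0.0
    for levels in product((0, 1), repeat=3):
        absent = sum(1 for f, on in zip(FACTORS, levels) if f in effect and not on)
        total += (-1) ** absent * means[label(levels)]
    return total

# Means as printed in the table (micra for cell height, mm. for hindlimb increase).
cell_height = {"1": 4.8, "a": 5.4, "b": 7.8, "ab": 8.1,
               "c": 5.8, "ac": 6.2, "bc": 8.1, "abc": 8.1}
hindlimb = {"1": 0.5, "a": 3.5, "b": 5.7, "ab": 7.6,
            "c": 5.2, "ac": 7.4, "bc": 8.4, "abc": 7.9}

for effect in ["a", "b", "ab", "c", "ac", "bc", "abc"]:
    print(f"{effect.upper():4s} cell height {total_effect(cell_height, effect):+5.1f}"
          f"   hindlimb {total_effect(hindlimb, effect):+5.1f}")
```

Under these assumptions the computed effects agree with the tabulated ones
to one decimal place, for example +9.9 for TSH and −1.5 for the TSH and
serum interaction on cell height. The same rule gives 11.6 for the total
effect of serum on hindlimb increase, an entry that does not appear in the
table as reproduced here.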


Cell height. The results of the factorial analysis show, as D’Angelo 
and Gordon reported, that there is a marked response to the administered 
thyroid stimulating hormone — TSH, (+ 9.9), and also that normal 
serum has a significant effect, (+ 2.1), on the cell height. D’Angelo and 
Gordon also state that a combination of these treatments, either in the 
presence or absence of thyroxine, results in an effect not significantly 
different from that of TSH alone. However, when analyzed factorially, 
there is a negative interaction, (— 1.5), between the TSH and serum 
when administered simultaneously, which is significant at the 5% level. 
There are some possible explanations for this discrepancy. First, it 
should be noted that there is evidence that the response to administered 
TSH follows a curvilinear regression so that a uniformly increasing 
response would not be expected at all dosage levels. Thus the
interaction could be due to a small amount of TSH in the serum; however,
it was also reported that this serum had considerable activity that could
be attributed to the presence of thyroid hormone. The most tempting
explanation is that this thyroid principle actually inhibits the action of 
exogenous thyrotrophic hormone on the resting follicular cell of the 
thyroid of the stasis tadpole. A similar phenomenon has been reported 
by several investigators, including Cortell and Rawson (1941) working 
on the hypophysectomized rat, and it is interesting to note that the 
stasis tadpole is essentially hypophysectomized with respect to its TSH 
activity (D’Angelo et al., 1941). In any event, it does not seem that
the response of the resting thyroid follicular cell to a combination of 
treatments, i.e., TSH and serum, both of which presumably possess a 
significant amount of thyrotrophic activity, is appropriate for the de- 
termination of TSH in a quantitative manner. The fact that a combi- 
nation of all three treatments produces a total effect that shows only a 
negligible amount of interaction when analyzed factorially, (— 0.1), 
perhaps makes the interpretation of inhibition of TSH activity by the 
thyroid hormone untenable, but this seems unlikely when the effects on 
the hindlimb increase are considered, and may indicate that the amount 
of interaction may vary according to the relative amounts of the hor- 
mones present. 

Hindlimb Increase. D’Angelo and Gordon report on the basis of the 
original data that all three treatments are effective in causing a sig- 
nificant increase in hindlimb length, and that in combination, the 
effects of the hormones are additive, while serum does not potentiate or 
diminish the effects of either hormone. When the data are analysed 
factorially, the separate treatments are seen to have significant positive 
effects, but, in all cases where the treatments are combined, there are 
negative interactions which are significant at the 1% level except where 
all three are combined. These interactions make some interesting in- 
terpretations possible, although the fact that there is a diminishing 
response to additional TSH possibly accounts for some of these results. 
Presumably the serum possesses both TSH and thyroid activity so that 
in any combination of the three treatments both thyroid and thyro- 
trophic principles would be represented. On the basis of this, it would 
seem that the results of the factorial analysis of the original data furnish 
rather striking evidence that inhibition of TSH activity by the action 
of the thyroid hormone on the thyroid gland actually occurs in the 
stasis tadpole. Furthermore, these results show that the amount of 
interaction seems to vary with the relative proportions of the hormones 
administered and may be present to a very significant degree,
particularly with respect to hindlimb increase.

Thus, when the results of the factorial analysis are considered, it 
would seem to be impossible to determine both thyroid and thyrotrophic 
principles in the same sample of serum in any sort of quantitative 
manner due to an actual inhibition of TSH activity by the thyroid
hormone. The stasis tadpole does appear to be a relatively sensitive test
animal for the bioassay of TSH alone and D’Angelo and Gordon have 
shown in this same paper that both hindlimb increase and cell height 
show a significant positive correlation with the dosage of TSH. It seems 
likely that, provided the effects of the thyroid hormone can be obviated 
by some other method, the stasis tadpole would serve as a very suitable 
test animal for the bioassay of TSH inasmuch as there are two variables 
that can be measured in the same animal and both bear a direct relation 
to the dosage level of TSH.
WHY I PREFER LOGITS TO PROBITS*


JOSEPH BERKSON, M.D., D.Sc.


Section of Biometry and Medical Statistics, 
Mayo Clinic, 
Rochester, Minnesota 


I AM GRATEFUL to the organizers of this meeting for affording me the 
opportunity to try to clarify a few points with respect to my use of 
the logistic function in bio-assay. From what has appeared in the 
literature, as well as from discussions that I have had with mathe- 
matical statisticians, it appears that the issues are, in considerable 
degree, confused. Aside from some specific points that are of interest, 
I believe that a discussion will be useful in calling attention to a number 
of questions that affect the application of statistical methods to scientific 
experiments in general. I must, however, warn you that I am not 
going to attempt a comprehensive review in the brief period available, 
and I shall be unable to give the attention due the great amount of work 
done by others. Instead I mean merely to present a few ideas that I 
have accumulated in the course of many years of use of the function. 

I have enjoyed a good deal of leg-pulling as a result of my having
baptized a statistical unit as the “logit,” and I imagine the users of
“probits” have had similar experiences. It has been suggested that the
practice of christening the linear transforms of functions (or, as
Professor Gumbel writes me they should be called, “reduced variates”),
each with its own diminutive nickname, is an exhibition of juvenile
exuberance. Still, at least where no applicable name is already
available, it can be useful. However, I wish to make it clear that for me
these entities have no independent sovereign being as “metameters”;
they are here merely a convenient way of graphically representing and
fitting a function, and what I wish to speak about is a comparison of
the logistic curve and the integrated normal curve, the equations for
which are:


Logistic curve:

$$y = \frac{1}{1 + e^{-(\alpha + \beta x)}}$$

$$\operatorname{logit}(y) = \ln\frac{y}{1 - y} = \alpha + \beta x$$

Integrated normal curve:

$$y = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta x - 5} e^{-t^{2}/2}\, dt$$

$$\operatorname{probit}(y) = \text{normal deviate} + 5 = \alpha + \beta x$$


*Read at a joint meeting of the Institute of Mathematical Statistics and The Biometric Society
(ENAR), Oak Ridge, Tennessee, March 16, 1951.


For each of these, a transformation of the dependent variate y may be
made such that, if the value of the transform is plotted against the
independent variate x, the points fall on a straight line, the slope and
intercept of which give the parameters of the original function. The
use of this scheme for fitting curves graphically is as old as the hills;
before I studied statistics formally, I thought that what was meant by
curve fitting was this procedure. However, a fundamental departure
from this routine was made when Fisher presented the method for
dealing with the linear transform of the integrated normal curve in such
a way as to provide an exact maximum likelihood solution in terms of
the original binomially distributed observations. Since then the method
has been extended to other functions than the integrated normal curve,
other variation than binomial, and other criteria of fit than maximum
likelihood. I have published some graph papers which are designed
for the easy use of the linear transform method of plotting and fitting
these curves; some examples are shown in figure 1.* The dependent
variate is plotted directly without consultation of tables, by way of the
left ordinate scale, against the dosage scaled on the abscissa, and the
transform value itself can be read on the ordinate scale on the right.
Since much work is done in terms of the independent variate measured
logarithmically, papers with a variable logarithmic range as abscissa
are being published; the one for the logistic function is illustrated in the
figure.
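The linearizing property of the two transforms can be illustrated
numerically. The sketch below uses only the Python standard library; the
parameter values and doses are hypothetical, and the function names are
introduced here for illustration. The probit is taken as the normal deviate
plus 5, as in the equations for the integrated normal curve.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def logistic_curve(x, alpha, beta):
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

def integrated_normal_curve(x, alpha, beta):
    return STANDARD_NORMAL.cdf(alpha + beta * x - 5.0)

def logit(y):
    return math.log(y / (1.0 - y))

def probit(y):
    return STANDARD_NORMAL.inv_cdf(y) + 5.0

# Hypothetical parameters and doses, chosen only for illustration.
alpha_l, beta_l = -2.0, 1.0
alpha_p, beta_p = 3.8, 0.6
doses = [0.0, 1.0, 2.0, 3.0, 4.0]

for x in doses:
    y_l = logistic_curve(x, alpha_l, beta_l)
    y_p = integrated_normal_curve(x, alpha_p, beta_p)
    # Each transform returns a value that is linear in the dose x.
    print(f"x={x:.0f}  logit={logit(y_l):+.3f}  probit={probit(y_p):.3f}")
```

Plotted against x, the logits fall on the line with intercept `alpha_l` and
slope `beta_l`, and the probits on the line with intercept `alpha_p` and
slope `beta_p`.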

*The cost of publication of these papers, as well as several other similar ones, which have been
issued with the advice of a number of mathematical statisticians, was borne by the Mayo Foundation
for Medical Education and Research, University of Minnesota. The papers are sold without profit
to the originators, by the Codex Book Company, Norwood, Massachusetts.

The title reads “Why I Prefer”; the personal pronoun and the
element of taste are to be emphasized. There is not going to be here any
proof that one curve is “correct” or “fundamental” and the other false,
and I am not attempting to lay down any fixed rules of behavior for
every statistician. At one time I used probits myself, not by that
name, but I did use so-called “probability paper,” and I also utilized
the normal deviate as listed in Pearson’s tables, volume 1, for
situations requiring a symmetrical sigmoidal curve. I have tried to estimate
the date when I went over to the logistic curve. I believe it was in
1925, but I know precisely the occasion for my change. I was attending
statistical courses in Baltimore. You will remember that the Pearl-Reed
curve of population growth was called by the authors the “logistic”
curve, because Verhulst, who, they discovered, had used it
before them, had called it that, and we learned much about this function
in class. I was myself interested in experimental laboratory data,
for it was in the experimental laboratory that my work had been done,
and I was very much impressed by the physical meaningfulness of the
differential equation of the logistic function, which is:


$$\frac{dy}{dx} = \beta y (1 - y) \qquad (1)$$


If y is the “mass” of the observed quantity, it is seen that its rate of
change with respect to x is proportional to that mass, as for the
exponential function, but in the present situation it is also proportional to a
factor which decreases as y increases. This is a dynamic equation of
mass action, the fundamental principle of physical interaction, and I
recognized that it was the established physical law of a number of
reactions, taking different forms according to the different physical
meanings of y and the rate constant β. I then applied it with great
success in a number of experimental fields in which I was working.
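That the logistic curve satisfies this differential equation can be checked
numerically. The sketch below integrates dy/dx = βy(1 − y) by the classical
Runge-Kutta method and compares the result with the closed-form curve
y = 1 / (1 + exp(−(α + βx))). The parameter values are hypothetical and the
function names are illustrative.

```python
import math

def logistic_rate(y, beta):
    """Right-hand side of dy/dx = beta * y * (1 - y)."""
    return beta * y * (1.0 - y)

def integrate_rk4(y0, beta, x_end, steps):
    """Integrate from x = 0 to x_end with the classical fourth-order Runge-Kutta method."""
    h = x_end / steps
    y = y0
    for _ in range(steps):
        k1 = logistic_rate(y, beta)
        k2 = logistic_rate(y + 0.5 * h * k1, beta)
        k3 = logistic_rate(y + 0.5 * h * k2, beta)
        k4 = logistic_rate(y + h * k3, beta)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y

def logistic_curve(x, alpha, beta):
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))

alpha, beta = -3.0, 1.5                  # hypothetical parameters
y0 = logistic_curve(0.0, alpha, beta)    # starting mass at x = 0
numeric = integrate_rk4(y0, beta, x_end=4.0, steps=400)
exact = logistic_curve(4.0, alpha, beta)
print(f"numerical: {numeric:.8f}   closed form: {exact:.8f}")
```

The rate is small when y is near 0 (little mass to act) and again when y is
near 1 (the factor 1 − y has nearly vanished), which produces the sigmoid
shape.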

In 1929 Professor Reed and I wrote an article illustrating a number
of these applications, some of which are shown in figure 2. The first
part of the figure (2a) represents the phenomenon of autocatalysis, in 
the present instance the hydrolysis of ethyl acetate by the catalytic 
action of acetic acid, to form ethyl alcohol and acetic acid. Since one 
of the products formed, acetic acid, is the catalytic substance, we have 
“autocatalysis.” Here is illustrated a direct application of the differ- 
ential equation (1), for the rate of increase of the acid, which is the 
same as the rate of decrease of the acetate from which it is formed, is 
proportional to the product of the two masses concerned, the ethyl 
acetate and the acetic acid. If we call the original amount of ethyl 
acetate unity, then since the acid y is formed directly from the acetate, 
the amount of acetate at any time is (1 — y) which, multiplied by the 
amount of acid y, gives the product mass to which the rate of reaction 
is proportional. It is from this application that the logistic function 
obtains the name sometimes applied to it, “the curve of autocatalysis.” 

The next application, illustrated in figure 2b, is, by design, a very
different one, the equation giving the electrode potential of an
oxidation-reduction reaction, in this instance a mixture of
1-naphthol-2-sulfonate indophenol reduced by titanous chloride. The
equation governing the electrode potential is given as follows:

$$E_h = E_0 - \frac{RT}{nF} \ln \frac{S_r}{S_o}$$

in which $E_h$ is the electrode potential measured against a hydrogen
electrode, $E_0$ is a constant characteristic of the particular system, $R$,
$T$ and $F$ are respectively the gas constant, absolute temperature and
the faraday, $n$ is the number of molecules involved, $S_r$ is the
concentration of the reductant, and $S_o$ is the concentration of the
oxidant. On the face of it, this equation is not logistic. But by
appropriate assembling of terms, the details of which are to be found in the
article referred to, it is recognizable as the logistic function, and an
appropriate linear transform is shown plotted in figure 2b to illustrate the
fit and to determine one of the parameters. I believe this method is now
used in many chemical laboratories for estimating the parameters.
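One way of assembling the terms is sketched below. It is offered as an
illustration and is not necessarily the derivation given in the article
referred to. It assumes the electrode equation in the form
$E_h = E_0 - (RT/nF)\ln(S_r/S_o)$, and it introduces $y$ (a symbol used here
only for illustration) for the fraction of the system present in the reduced
form.

```latex
\begin{align*}
y &= \frac{S_r}{S_r + S_o}
  \quad\Longrightarrow\quad
  \frac{S_r}{S_o} = \frac{y}{1 - y}, \\[4pt]
\ln\frac{y}{1 - y} &= \ln\frac{S_r}{S_o}
  = \frac{nF}{RT}\,(E_0 - E_h), \\[4pt]
y &= \frac{1}{1 + e^{-(\alpha + \beta E_h)}},
  \qquad \alpha = \frac{nF}{RT}\,E_0,
  \qquad \beta = -\frac{nF}{RT}.
\end{align*}
```

On this reading the logit of the fraction reduced is linear in the measured
potential, so that a straight-line plot of the transform against $E_h$ would
give $E_0$ as the potential at which the logit is zero.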

The next part of the figure, 2c, illustrates the bi-molecular reaction,
a textbook example of the reaction between methyl bromide and
sodium thiosulfate. The next part of the figure, 2d, is an example of
application to an enzyme reaction, the hydrolysis of sucrose by
invertase, from a paper by Berkson and Hollander, to illustrate not only
the extreme exactitude of the fit, but the direct interpretation shown
in the succeeding part of the figure, 2e, of the parameter r* as a measure
of intrinsic rate of reaction. The r parameter, determined from separate
reactions using different concentrations of the enzyme, is shown plotted
against these concentrations. The rate of reaction is known frequently
to be proportional to concentration, and it is to be noted in this example
that the proportionality is very precise, with a zero rate indicated for a
zero concentration, as is rationally to be anticipated.

*Represented elsewhere in the text discussion as β.

The last example shown, figure 2f, refers to the hemolysis of
erythrocytes by a hemolysin, in this case sodium hydroxide. Since
erythrocytes are biologic organisms and what is observed is the fraction
hemolyzed in relation to increasing concentration of hemolysin, this is
a bio-assay experiment with quantal response. The data are from an
illustration of Von Krogh who, on the basis of considerations of the
laws of surface tension and other physical factors, elaborated what is
sometimes referred to as Von Krogh’s law of hemolysis. Von Krogh’s
theoretically derived equation is, in fact, the logarithmic logistic. The
physical theory of the logistic representation of hemolysis which I prefer
is not Von Krogh’s but a variety of autocatalysis. However, I am
opposed to replacing Von Krogh’s equation by the integrated normal
curve, which has been advanced with the argument that we should
consider the phenomenon to be a reflection of a distribution among the
corpuscles of tolerances to hemolysis.

One could easily cite many more examples, but those given will 
suffice to convey the point I desire to make. It is not that the logistic 
function is necessarily the physical law of all sigmoidally represented 
phenomena, although at least in the first three instances referred to 
the logistic equation is universally accepted as the pertinent physical 
law, and the others have wide application. It is rather that the logistic 
function refers to a wide range of phenomena, the intimate physical 
mechanisms of which are different in different cases. However, all of 
these mechanisms are dynamic and none of them is conceived of as 
referring to a static distribution of tolerances. I think it is not
inaccurate to generalize them as the result of the interaction of
participating masses obeying the mass-action principle. This, briefly, is
what I should characterize as the general theory of the logistic. 

Now what is the theory of probits? I shall quote Finney’s
explanatory introduction in his textbook.

“For quantal response data it is therefore necessary to consider the
distribution of tolerances over the population studied. If the dose, or
intensity of the stimulus, is measured by $\lambda$, the distribution of
tolerances may be expressed by $dP = f(\lambda)\,d\lambda$; this equation
states that a proportion, $dP$, of the whole population consists of
individuals whose tolerances lie between $\lambda$ and $\lambda + d\lambda$
... . If a dose $\lambda_0$ is given to the whole population, all individuals
will respond whose tolerances are less than $\lambda_0$, and the proportion
of these is $P$, where $P = \int_0^{\lambda_0} f(\lambda)\,d\lambda$.”
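The tolerance formulation in the passage quoted can be made concrete by
simulation. The sketch below assumes, purely for illustration, that
tolerances are normally distributed with mean 2.0 and standard deviation 0.5
on the dose scale, so that negative tolerances are negligible and the lower
limit of the integral is immaterial. These values, the seed, and the function
name are hypothetical.

```python
import random
from statistics import NormalDist

def proportion_responding(tolerances, dose):
    """Fraction of individuals whose tolerance is less than the dose given."""
    return sum(1 for t in tolerances if t < dose) / len(tolerances)

rng = random.Random(1951)
tolerance_distribution = NormalDist(mu=2.0, sigma=0.5)  # hypothetical f(lambda)
tolerances = [rng.gauss(2.0, 0.5) for _ in range(20000)]

dose = 2.4
simulated = proportion_responding(tolerances, dose)
theoretical = tolerance_distribution.cdf(dose)  # the integral of f up to the dose
print(f"simulated P = {simulated:.3f}   integral of f = {theoretical:.3f}")
```

The agreement shows only that the integral follows from the assumption of
fixed individual tolerances; it does not bear on whether such tolerances
exist.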

Here, as you see, the theory is that the mechanism resides in a vital 
property of the organism acted on, referred to as “tolerance,” which 
differs from animal to animal; and Finney refers, throughout his text, 
to the parameters of the function as being measures of the frequency 
distribution of these tolerances. Even more explicitly Withell, in an
article advancing the thesis that the logarithmic death times of bacteria 
are normally distributed, one of the variations of the probit theory, says 
in explanation: 

“The obvious alternative to the monomolecular theory is to say that 
the length of time an organism can survive in a bactericide is pro- 
portional to its resistance. This is an assumption, but an assumption 
so simple that it makes no reference to any of the complicated reactions 
both physical and chemical which may occur between the organism and 
reagent.”

Here we see the point of view unambiguously stated, and the theory
is frequently advanced with a didacticism and sense of confidence which
imply that it is virtually a truism. Finney has written: “Berkson (1944)
advocated using the logistic curve instead of the normal sigmoid in the
analysis of data from quantal assays; he argued, ‘However the logistic
function is very close to the integrated normal curve, it applies to a
wide range of physicochemical phenomena and therefore may have a
better theoretic basis than the integrated normal curve.’ This
statement seems to be based on a misconception, since the reason for
introducing the transformation specified by equation (51) is not directly
the shape of the normal sigmoid, but the derivation of the
transformation from a normal distribution of tolerance as measured on the
x-scale.” There was, in fact, not any misconception. Even with a clear
understanding of this hypothetical formulation, it is permissible to 
suggest that it is not necessarily the last word on the subject. If it is 
seriously believed that there is some physical property more or less 
stably characterizing each organism, which determines whether or not 
it succumbs, then it is justifiable to advance the hypothesis of a distri- 
bution of tolerances. In that case one should be prepared to suggest 
the nature of this characteristic so that the hypothesis may be capable 
of corroboration by independent experiments. If on the other hand the 
formulation is only that of a “mathematical model,” to guide the 
method of calculation, then it would seem more objective and heu- 
ristically sounder not to create any hypothetical tolerances, but merely 
to postulate that the proportion of organisms affected follows the inte- 
grated normal function. I am interested in the slope of the dosage 
mortality line as a “rate,” of the objectively observed increase of 
mortality with increase of dosage, not as a standard deviation of hy- 
pothetical tolerances of the animals. I should of course be very much 
interested in the last, if tolerance of the animals is what I was observing 
and studying. But we are not dealing with measured tolerances, we 
are dealing with a dosage mortality curve, and when my probitistic 
friends present a standard deviation of tolerances, they may be asserting 
a substantial quantity for the variability of something that in fact does 
not exist at all. I once had a teacher of philosophy who employed the 
Socratic method in class. When a student gave a simple and especially 
plausible explanation of a very complex social phenomenon the professor 
said, “Please, sir, do not make up history.” I should like to ask
mathematical statisticians when they formulate mathematical models to please
not make up physics. 

The practice of injecting an interpretation of “tolerance” into re- 
sponse data is objectionable not only on philosophical grounds; it can 
be misleading and harmful. During the war, when I was statistician 
in the Office of the Air Surgeon, there came to my attention a great
deal of data collected in air-chamber experiments in which pilots had
been subjected to high-altitude conditions, and their all-or-none
response of getting the “bends” was noted. Those who succumbed were
tagged as having low tolerance and not to be assigned to high-altitude 
flying. Then, at my suggestion, the experiments were repeated. Many 
individuals who had succumbed the first time did not succumb the 
second time, and many who did not succumb the first time did succumb 
the second time. I suppose one could say that the tolerance of these 
individuals had changed, but then again maybe it was only the weather. 

If we note that the probability of response is a steadily increasing 
function of dose, is it necessarily true, as Finney implies, that under- 
lying this observation is a distribution of tolerances in the organisms 
acted on? I have on my desk a report of a bio-assay experiment, in
which drosophilae are exposed to increasing doses of x-ray intensity,
and the quantal response (the percentage showing mutation) is
observed. According to Finney’s view we should assume that the
difference of per cent response between one dose and the next higher dose
reflects the frequency differential of tolerances of the flies in that dose 
interval. That different varieties of Drosophila are differently sensitive 
to forces effecting mutation is well known, but so far as any specific 
experiment is concerned, the generally accepted theory is that the 
probability of an effective hit by a photon—which is what produces the 
mutation—is proportional to intensity of the radiation and is not 
effectively related to anything that can be intelligibly referred to as 
tolerance of the fly. If an arrangement were made by which marksmen 
shot at a target repeatedly until the bull’s-eye was struck, and it 
turned out that the number of shots required to hit the mark was dis- 
tributed in cocked-hat fashion, your thorough-going probitist, inspect- 
ing the table of results, would be likely to say: “Evidently the resistance 
of the targets to getting hit in the bull’s-eye is logarithmic normally 
distributed, and we should use probits.” This, without much exaggera- 
tion, is the theory which a recent critique, evidently in reflection of the 
climate of opinion prevailing in many statistical circles, characterizes 
as possessing “superior theoretical foundations.” I cannot agree; the
theory that consists in always attributing an animal’s death or survival 
to the absence or presence of resistance, is reminiscent of Moliére’s 
reference to the physician’s explanation of the action of opium. “Opium 
puts a person to sleep,” said Moliére’s doctor, “because it contains a 
dormitive principle.” 

If it is insisted on that the model must be that of a distribution of 
tolerances, it is still possible to consider that the logistic curve
represents the cumulative frequency of these. There is no certainty or
heaven-ordained law that all variates distributed in a symmetrical 
cocked-hat manner follow the gaussian-normal curve. Those who cite 
the limit theorems of mathematical statistics which show that functions 
of observations tend to the gaussian-normal curve with increase of 
numbers should remember that these do not apply to all distribution 
forms, and not even to all statistics of the gaussian distribution itself. 
Gumbel noted, in developing the distribution of the midrange: “The
asymptotic distribution of the midrange for a symmetrical initial dis- 
tribution is symmetrical but not normal. This may be of interest since 
it refutes the widespread opinion that all measures of central tendency 
converge towards normality.” The symmetrical distribution to which 
Gumbel refers is, interestingly enough, the logistic. 

In many, perhaps most, bio-assay situations, it is really gratuitous 
to speak of a theory of the observed phenomena, in any serious literal 
sense. The factors involved are so many and the whole is so variable 
and complex that the function employed is best considered merely as an 
empirical curve, employed for the succinct statistical reduction of the 
data, of great descriptive utility, but without specific theoretical sig- 
nificance. Considered this way, it is appropriate to compare the em- 
pirical practical aspects of utilizing the two curves. Pictorially the 
curves are strikingly similar, so much so that they are practically super- 
posable, but their respective equations make one easier to handle than 
the other. 

In the first place, to evaluate the logistic curve we need no special 
tables besides a table of logarithms, for the logistic is only a slightly 
modified exponential function, whereas the cumulative normal, being 
an integral equation, necessarily requires special tables. Values of 
logits, anti-logits and logit weighting functions are all available in 
tabular and nomographic forms, but “available” here is used in the 
sense employed in statistical literature, that is, available somewhere. 
Some have been published, all are in my office,* but I confess that
almost as frequently as not, I do not use them. I reach for the
four-place logarithm table in the Rubber handbook, or even only for a
slide rule. For very careful work, on the other hand, as in sampling 
experiments, I can obtain logarithm or exponential tables with as many 
places as needed, whereas I have yet to find out whether the most 
elaborate tables of the normal function available, the WPA tables, are 
good enough. 

The logistic curve is distinctly easier to fit by various methods 


*Copies of these tables may be had by writing to me addressed to the Mayo Clinic, Rochester, 
Minnesota. 
than the integrated normal curve, and a special facility is present when 
the variate is considered to be distributed binomially, as in bio-assay 
with quantal response. This arises from the fact that if the variate p 
is logistic, the first derivative is pq. Since the weight in fitting pro- 
cedures is reciprocal to the binomial variance pq, these will frequently 
cancel. Thus the normal equations for the maximum likelihood estimate 
of the logistic function are: 

$$\sum n(p - \hat{p}) = 0$$
$$\sum n x (p - \hat{p}) = 0$$


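It is seen that the coefficients are dependent only on n. The normal 
equations for the integrated normal curve are: 

$$\sum \frac{n z}{\hat{p}\hat{q}} (p - \hat{p}) = 0$$
$$\sum \frac{n z x}{\hat{p}\hat{q}} (p - \hat{p}) = 0$$

Here the coefficients contain not only z, the ordinate of the normal 
curve, but $\hat{p}\hat{q}$, the values to be estimated. 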
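As an illustration of the logistic normal equations, the following minimal Python sketch fits a logistic curve to grouped quantal data by Newton's method and then evaluates the two sums, Σn(p − p̂) and Σnx(p − p̂), at the solution. The doses, group sizes and response counts are hypothetical, and the function names `logistic`, `score` and `fit_logistic` are introduced here only for illustration. It is a sketch under those assumptions, not the routine used in the sampling experiments.

```python
import math

def logistic(a, b, x):
    """Expected proportion responding at dose x for intercept a and slope b."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def score(x, n, r, a, b):
    """Left-hand sides of the two logistic normal equations.

    With observed p = r/n and fitted P, the term n*(p - P) equals r - n*P.
    """
    resid = [ri - ni * logistic(a, b, xi) for xi, ni, ri in zip(x, n, r)]
    return sum(resid), sum(xi * e for xi, e in zip(x, resid))

def fit_logistic(x, n, r, iterations=25):
    """Maximum likelihood estimates (a, b) by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0, g1 = score(x, n, r, a, b)
        # Information matrix entries use the weights n*P*Q.
        w = [ni * logistic(a, b, xi) * (1.0 - logistic(a, b, xi))
             for xi, ni in zip(x, n)]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * delta = g and update the estimates.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical data: coded dose, number exposed, number responding.
doses = [-2.0, -1.0, 0.0, 1.0, 2.0]
exposed = [20, 20, 20, 20, 20]
responding = [2, 6, 11, 15, 19]

a_hat, b_hat = fit_logistic(doses, exposed, responding)
s0, s1 = score(doses, exposed, responding, a_hat, b_hat)
print(a_hat, b_hat, s0, s1)
```

The residual terms carry no factor other than n (and x), because the derivative PQ of the logistic cancels the reciprocal of the binomial variance; the weights n·P·Q enter only in the iteration, not in the equations being solved.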

For the last several years I have been investigating, by sampling 
experiments, the relative precision of minimum χ² and maximum likeli- 
hood estimates of regression coefficients, and these experiments have 
turned up some findings that appear to give the logistic function a 
definite advantage over the integrated normal curve as a statistical 
instrument. The computers report that they can accomplish thirty 
definitive minimum logit χ² solutions by the method I have advanced 
in the time required to accomplish one definitive maximum likelihood 
probit solution. The logistic estimates—both the χ² and likelihood 
estimates—besides being asymptotically efficient are also sufficient, 
while estimates of the parameters of the integrated normal curve are not 
sufficient. Mathematicians appear to be of the opinion that such 
sufficient statistics are just about the best estimates that can be had.* 
Finally the minimum χ² estimates have smaller sampling error than 

I do not mean to give the impression that I hold that the logistic 


*Fisher says in his well-known text, “There is, however, one class of statistics, including some of 
the most frequently recurring examples, which is of theoretical interest for possessing the remarkable 
property that, even in small samples, a statistic of this class alone includes the whole of the relevant 
information which the observations contain. Such statistics are distinguished by the term ‘sufficient’ 
and in the use of small samples, sufficient statistics, when they exist, are definitely superior to other 
efficient statistics.” 
curve is mandatory as against the integrated normal curve for bio-assay. 
So far as results are concerned, say in estimating the L.D. 50 or other 
percentage points, there generally will be slight if any perceptible 
difference. In situations such as hemolysis, there are questions of phys- 
ical principles and actual physical facts involved, and here I think the 
replacement of Von Krogh’s law of hemolysis, which is the logistic func- 
tion, by the normal curve would be a serious scientific setback. But in 
more general situations, individuals may well decide to use what they 
are accustomed to, and since a very effective educational campaign has 
been conducted on the use of probits, and they have been adopted in 
many places as a routine procedure, doubtless they will continue to 
have wide and useful application. 
PART I: A CRITICAL EVALUATION 
I. INTRODUCTION 


IN THE PAST accidents were attributed to bad luck or the visitations 
of some unpropitious deity. Today they are regarded as a social 
problem. Several factors have combined to reorientate our thinking in 
this regard, and it is appropriate that the study of accidents should be 
made against the background of significant changes which have taken 
place in our mental attitudes during the past quarter of a century. 
The more noteworthy of these may be summarised as follows: 


(a) There has been a progressive realisation in industry that efficient 
and not cheap labour is the most economical. This principle is still 
more acceptable in theory than in practice in many countries where it 
is applied only on a restricted basis. 


(b) With this realisation has come the concept of social medicine with 
its insistence on full and promotive health which is considered as far 
more important than the mere absence of disease, and which is inter- 
ested in the maintenance of maximum as opposed to sub-health, in order 
to ensure maximum operating efficiency. 

It is perhaps rather ironical, that while this principle was preached 
before World War II, it should only have won wide acceptance at a 
time when the shortage of man-power, and the need to increase indi- 
vidual productive capacity are our most urgent national problems. 
(c) Side by side with this concept has come the victory of the psycho- 
logical sciences with their doctrine that full health has its mental as well 
as physical components. This has led to the psycho-somatic approach 
to medicine. 

It is natural, therefore, that industrialists should be driven more to 
seek the advice of psychologists, psychiatrists and sociologists as well 
as that of the general medical practitioner. 


(d) Finally “many circumstances have conspired to insist on the funda- 
mental human rights and human needs of every worker” (18)—not 
merely as the acquisitive tenets of a convenient political ideology, but 
rather as the essential basis for the promotion of human health and well- 
being. There is, of course, considerable divergence of opinion as to 
what are man’s legitimate needs. 


In view of the above, it is not surprising that our interest in the prob- 
lem of accidents in industry should have increased. Politicians, indus- 
trialists and laymen in general are becoming more aware of the fact that 
considerable erosion takes place each year in our human resources due 
to this factor. As a result both public and private bodies have been at 
great pains to accumulate a mass of statistics reflecting the frequency and 
severity of these occurrences in our everyday life. Furthermore, several 
attempts have been made to estimate the financial loss involved in 
terms of medical aid and compensation paid, lost time, material damage, 
hidden losses etc. (1), (2), (3), (31). 

These attempts are to be commended more for the attention which 
they draw to these losses than for their accuracy of estimation. The 
task is not an easy one, for costs must vary considerably under different 
circumstances. Thus Moore (13) asks, “How can anyone equate the 
value of 24,000 lives in terms of cash equivalents; and how tell the after 
effects on the individual of an impaired part of the body?” The de- 
creased rate of production, the loss of efficiency and quality of work, 
the disturbed emotional states, and consequent reduction in morale 
and spreading of unrest and disaffection are all types of minor costs 
which must be taken into consideration, but which defy financial assess- 
ment. Some idea of the cost of these factors to the South African 
community can be given by reference to the report of the Workmen’s 
Compensation Commissioner. In 1947, £780,705 in direct compensa- 
tion and medical aid were paid out in respect of 145,094 major accidents 
reportable under the act. Heinrich (3) has estimated that the hidden 
losses of accidents are from 4 to 4½ times the amounts paid out in 
compensation and medical aid. At the lower figure this would account for another 
£3,122,820, making a total of about £4,000,000 for major accidents alone. 
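The arithmetic behind this estimate can be checked in a few lines. The sketch below applies Heinrich's multiplier to the 1947 compensation figure quoted above. The names `direct_cost` and `total_cost` are illustrative only, and the upper multiplier of 4.5 is an assumption about the range intended.

```python
from fractions import Fraction

direct_cost = 780_705  # pounds paid in compensation and medical aid, 1947

def total_cost(direct, multiplier):
    """Return (hidden losses, total), taking hidden losses as multiplier x direct."""
    hidden = direct * multiplier
    return hidden, direct + hidden

hidden_low, total_low = total_cost(direct_cost, 4)
# The upper multiplier of 4.5 is an assumption, used only to show the range.
hidden_high, total_high = total_cost(direct_cost, Fraction(9, 2))

print(hidden_low, total_low)
print(float(hidden_high), float(total_high))
```

At a multiplier of 4 the hidden losses come to £3,122,820 and the total to £3,903,525, which is the basis of the round figure of £4,000,000.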
Although this study is primarily concerned with industrial accidents, 
it is as well to remember the increasing toll of human lives and happiness 
which is taken by road accidents each year. This is clearly illustrated 
in South Africa in the Annual Report of the Chairman of the District 
Road Safety Association, and the Vice-Chairman of the Executive of 
the Transvaal Road Safety Committee which was presented at the first 
General Meeting of the Transvaal Road Safety Committee held in 
Johannesburg on the 8th and 9th May, 1950. It was reported that, 
“Between the years 1940 and 1944 South Africa suffered 23,300 war 
casualties compared to 54,987 casualties suffered on the roads of the 
Union”. 

The wider acceptance of psychology as a science, and the conse- 
quent development of psycho-somatic medicine have led to a deeper 
appreciation of the possible causes of accidents. These occurrences are 
no longer regarded as inescapable by-products of our mechanised age 
which have to be accepted in a spirit of fatalism. As man has become 
more willing to apply scientific techniques to the study of his own be- 
haviour, his attitude to this phenomenon has undergone considerable 
change. Accidents are no longer regarded as entirely fortuitous events 
and the inevitable price to be paid for technological advancement. 
Events which were previously considered to be chance-determined are 
now regarded as preventible, and causes which were hitherto regarded 
as beyond the control of the individual are now seen in many cases as 
intimately related to his psycho-physiological make-up. It is not a 
question of the blame being shifted from the environment to the indi- 
vidual, but rather an appreciation that what really matters is the degree 
of adjustment which exists between the two. Our appreciation of the 
wide range of individual differences which exist in man has led to the 
natural conclusion that considerable improvement can be effected in 
human adjustment by a more careful consideration of those aspects of 
the environment which are man-made, and also the varying degrees of 
skill, mental ability, physical constitution, temperamental and per- 
sonality qualities with which individuals are equipped. As a result 
accidents are today more often regarded as problems of human adjust- 
ment, or as manifestations of maladjustment. 

It is appropriate that we should examine some of the evidence in 
support of this view. It is customary among investigators to divide the 
causes of accidents into two classes. (a) Impersonal, with the primary 
cause in the environmental situation, and (b) Personal, with the primary 
cause to be found in the individual. 

This division is in many senses arbitrary, and the frequency with 
which these two sets of causes operate, will be found to vary considerably. 
Nevertheless, there is general agreement among investigators that by 
far the majority of accidents are to be attributed to personal factors. 
Thus as far back as 1918 H.M. Safety Committee on Workshops and 
Factories reported that “more than two-thirds of such accidents are due 
to other (i.e. non-mechanical) causes”. The findings of other investi- 
gators such as Drabs (5) and Kohler (6) support this conclusion. These 
findings suggest that accidents are overwhelmingly the result of the 
inadequacy of the human being to adjust to the particular set of circum- 
stances in which he is operating. 

In view of the fact that accidents are to a large extent associated 
with occupational maladjustment, and that preventive measures cannot 
afford to ignore this aspect of human behaviour, it is important that 
our existing psychological knowledge on the subject should be clearly 
defined. 

Ever since it was realised that personal factors play a great part in 
causing accidents, there has been no dearth of protagonists advocating 
various ideas and remedies. Some of these are sound and well-sub- 
stantiated by research findings. Others are rather shaky and arise out 
of investigations of a doubtful and inconclusive nature; and still others 
are merely the result of arm chair speculations which at times reveal 
great perspicacity of thought, but more often reflect merely the per- 
sonal prejudices of the thinker. Nowhere has one to be more careful 
in the evaluation of the results of investigation than in this field. Data 
associated with this problem are extraordinarily complex and the 
inter-relationship between causes so intricate, that it is difficult to set 
up adequate conditions of experimental control, and to isolate the part 
played by the various determining causes. Investigators in the past 
have been far too ready to accept “bold assumptions’’, and to be content 
with inadequately controlled situations and rather shaky techniques 
for analysis. The student is therefore encouraged to adopt a highly 
critical attitude to work in this field, and only to accept findings which 
have been submitted to the most thorough-going scrutiny, particularly 
the more dogmatic and sweeping statements referring to that ill-defined 
and rather cryptic phenomenon called “accident-proneness”. It will 
be shown later in this study that it is a very unfortunate but real fact 
that our knowledge of this concept has hardly proceeded further, and 
in some respects has suffered a reverse from the time when Greenwood, 
Woods, Yule and Newbold undertook their classic studies in 1919 and 
1926. 

Before proceeding to this aspect of the problem, it is necessary to 
clear the ground by considering the relationship between various per- 
sonal factors and accident frequency. 
Just as different environmental circumstances influence the liability 
of individuals to sustain accidents because of the inherent dangers 
contained therein, so different psycho-physiological qualities may affect 
the liability to accidents of individuals working in similar environmental 
situations. 

The more important of these can be listed as follows: 

(1) Health. It is a well known fact that many physical defects can 
place individuals at a disadvantage in certain situations. Defects of the 
senses, of height, strength, neuro-muscular co-ordination, rapidity of 
reaction etc., may predispose individuals to accidents in occupations 
which make heavy demands upon these factors. It would naturally be 
the concern of the medical officer in industry to take precautions against 
this by ensuring the correct placement of these individuals in relation 
to occupational hazards. It should be remembered also that though not 
suffering from permanent deficiencies, the physical state of any indi- 
vidual may suffer impairment from time to time during which his 
liability to accidents is probably increased. In this regard one is 
naturally concerned with the general state of health where the purely 
physical components can never be entirely divorced from the psycho- 
logical. Even the onset of diseased conditions which are purely organic 
in origin, are not unaccompanied by their psychological counterpart, 
which, in the form of reduced resistance to fatigue, boredom, irritability 
etc., may have considerable bearing on the individual’s susceptibility 
to accidents. 

Thus the precise relationship between the physical health of the in- 
dividual and accidents is not easy to define. Newbold (45) obtained 
correlations between accidents and sickness—defined as the number of 
visits to the surgery exclusive of accidents—of the order of .3. However, 
sickness as defined in this manner may reflect nothing more than “a 
tendency to report sick at the slightest provocation, and may in many 
cases .. . merely represent a disinclination to work engendered by some 
other environmental maladjustment”. In this case accidents and ‘sick- 
nesses’ may merely be symptoms of temperamental instability rather 
than physical ailments. Sicknesses involving lost time, are freer of this 
element of instability, and it is not surprising therefore that investigators 
(Newbold (45), Farmer and Chambers (32)) should have found insig- 
nificant correlations between these and accidents, particularly as illnesses 
of this type occur more among old people, who (as will be seen later) 
have fewer accidents than the young. 

Further studies, however, by Viteles (64) and Bingham and Slocombe 
(34) do indicate that there is very likely some relationship between 
physical fitness (particularly associated with blood pressure) and acci- 
dents. Further intensive research is needed to elucidate this relationship 
more adequately. 

(2) Age and Experience. The effects of these two factors on accident fre- 
quency have usually been studied together because it is difficult to 
disentangle their influences, except under experimentally controlled con- 
ditions, which cannot always be applied because of the constitution of 
groups in real industrial situations. 
Numerous studies have shown the influence of the above factors 
when taken separately on accident frequency in various types of in- 
dustry. (Metal Industries, Mines, Textile and Motor Transport—Ver- 
non (19)). Furthermore, Newbold (45), made an attempt to separate 
the influence of the two factors by the technique of partial correlations. 
She found that in the groups studied, when age was kept constant the 
association between length of service and accidents practically vanished, 
whereas when length of service was kept constant the relationship be- 
tween age and accidents remained almost undiminished. As Vernon (19) 
has pointed out, “the apparent lack of reduction with increasing ex- 
perience may have been due to the fact that the groups of workers 
studied contained very few new workers, for we have seen that it is they 
who are particularly responsible for high accident rates”. It would 
appear from the above that experience (per se) is important particularly 
in the early initial stages of employment, but that after a given period 
its influence becomes negligible (Humke (35)). 
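The technique of partial correlation mentioned above can be sketched as follows. The first-order partial correlation between accidents and one factor, with the other factor held constant, is computed from the three simple correlations. The coefficients in the example are hypothetical and are not Newbold's figures; the function name `partial_corr` is introduced here only for illustration.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical simple correlations, for illustration only.
r_acc_service = -0.20   # accidents with length of service
r_acc_age = -0.30       # accidents with age
r_age_service = 0.60    # age with length of service

# Accidents and service with age held constant.
service_given_age = partial_corr(r_acc_service, r_acc_age, r_age_service)
# Accidents and age with service held constant.
age_given_service = partial_corr(r_acc_age, r_acc_service, r_age_service)
print(round(service_given_age, 3), round(age_given_service, 3))
```

With these invented values the association with length of service nearly disappears once age is held constant, while the association with age is reduced much less.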
(3) Alcohol. Vernon (19) states that ‘the consumption of alcoholic 
liquors may greatly increase accident liability, even if the quantity taken 
is insufficient to induce intoxication’. His contention is supported by 
conclusive evidence from a variety of sources. The reason for this is 
not hard to find, for the effect which alcohol has, even in relatively small 
doses, on speed of reaction, co-ordination, and capacity for accurate 
muscular responses has been indicated in a study by Dodge and Benedict 
(36). The effect of alcohol on other mental functions has also been well 
illustrated in a number of other studies (37), (38). This contradicts 
the illusion which many people have that “they can drive much better 
when they have had one or two”. 
(4) Fatigue. So much has been written on the subject of fatigue, show- 
ing the many factors which go to make up this condition, that it is 
difficult to decide under which heading it should more legitimately be 
included. It is doubtful whether fatigue in industry ever consists of 
some degree of pure physical exhaustion, unaccompanied by the psycho- 
logical factors of ennui, boredom, discontent, irritability, bad morale, 
etc. The whole subject is really a study in itself. The most striking 
evidence of the influence of fatigue (physical and mental) is revealed by 
studying the frequency of accidents at different times of the work shift. 
The results of numerous investigations are already available, and much 
has been recorded in detail by a committee of the British Association 
(39). Nearly all the records show a rapid increase in the morning spells 
which reaches a peak an hour or two before the lunch break. The 
tendency in the afternoon is much the same except that the maximum 
is reached earlier. 

While these results would appear to show the undoubted influence 
of fatigue on accident frequency, they also throw some interesting light 
on the nature of fatigue itself. If physical exhaustion were the only 
factor involved, then surely the increase in accidents should show a 
steady rise, and not fall towards the end of each work spell. This would 
suggest the influence of factors other than fatigue—a conclusion which 
is supported by the work of Osborn and Vernon (15) relating to the 
frequency of accidents on night shift which showed an almost complete 
reversal of incidence to the day shift. Here the rate decreased steadily 
throughout the whole period of work. Vernon concludes ‘The output 
curve roughly resembled that observed during the day shift, though it is 
not possible to compare it closely owing to the different systems of spells 
and breaks adopted. The lack of correspondence between accidents and 
output suggests that speed of production played no part in the causation 
of night shift accidents. More probably the speed of production effect 
was similar during both shifts, but there were other factors present 
which interfered with it .... It is supposed, therefore, that the changes 
in the mental state of the day shift workers augmented the tendency of 
the accidents to increase owing to increased speed of production, while 
those of the night shift acted in the reverse direction, and overpowered 
the speed of production effect’. In addition to mental state, it seems 
feasible that the behaviour and eating habits of the workers before 
coming on shift by day and by night respectively, may also have some- 
thing to do with the resulting accident incidence. There may well be 
a tendency for workers on day shift to start without an adequate 
morning meal. This is perhaps less likely when they start in the even- 
ings. This fact alone, however, would not entirely explain the falling 
off of accidents in the morning and afternoon spells. Interpretations of 
this kind can be varied considerably. They may perhaps all contain 
some truth but not the whole truth. They serve, however, to illustrate 
that the problem of industrial fatigue is a highly complicated one. 

The research results which are to be found in the literature showing 
the relationship between environmental factors and accidents appear to 
be fairly conclusive. It is imperative, however, that the results should 
be interpreted with considerable caution. It is a fallacy to assume that 
a simple cause-and-effect, direct relationship exists between the en- 
vironment and human behaviour, just as it does between the [...] 
of a gas and temperature in Boyle’s Law. This assumption 
ignores the intervening factor of ‘fatigue’ between the environment and 
response, and the rather complex psycho-physiological c[...] 
which combine to constitute this state. The fallacy of this assumption 
has been strikingly revealed in one of the most spectacular research 
experiments in human behaviour of our time. About sixteen years ago 
a study was commenced in the Hawthorne Plant of the Western Electric 
Company (40) the purpose of which was to find out the influence of 
various environmental factors (illuminations, rest pauses, etc.) on in- 
dustrial efficiency and output. The results were quite different from 
what was expected in terms of previous findings. A single example will 
suffice in this study. “In one experiment the workers were divided into 
two groups. One group called the Test Group, was to work under 
different illumination intensities. The other group, called the Control 
Group, was to work under an intensity of illumination as nearly constant 
as possible. During the first experiment the test group was submitted 
to three different intensities of illumination of increasing magnitude 24, 
26 and 70 foot candles. What were the results of this early experiment? 
Production increased in both rooms—in both the test group and the 
control group—and the rise in output was roughly of the same magnitude 
in both cases. In another experiment, the light under which the test 
group worked was decreased from 10 to 3 foot candles, while the control 
group worked, as before, under a constant level of illumination in- 
tensity. In this case the output rate in the test group went up instead 
of down. It also went up in the control group”. 

Further experiments of this type were carried out on different en- 
vironmental factors in the Relay Assembly test room. The results were 
equally confusing. Roethlisberger’s (41) own comments are worth 
quoting: 

“What did the experimenters learn? Obviously, as Stuart Chase 
said, there was something ‘screwy’, but the experimenters were not 
quite sure who and what was screwy—they themselves, the subjects, or 
the results. One thing was clear: the results were negative. Nothing 
of a positive nature had been learned about the relation of illumination 
to industrial efficiency. If the results were taken at their face value, it 
would appear that there was no relation between illumination and in- 
dustrial efficiency. However, the investigators were not yet quite 
willing to draw this conclusion. They realised the difficulty of testing 
for the effect of a single variable in a situation where there were many 
uncontrolled variables. 

“A few of the tough-minded experimenters already were beginning to 
suspect their basic ideas and assumptions with regard to human motiva- 
tion. It occurred to them that the trouble was not so much with the 
results or with the subjects as it was with their notion regarding the way 
their subjects were supposed to behave—the notion of a simple cause- 
and-effect, direct relationship between certain physical changes in the 
workers’ environment and the response of the workers to these changes. 
Such a notion completely ignored the human meaning of these changes 
to the people who were subjected to them. 

“In the illumination experiments, therefore, we have a classic exam- 
ple of trying to deal with a human situation in non-human terms. The 
experimenters had obtained no human data; they had been handling 
electric light bulbs and plotting average output curves. Hence their 
results had no human significance. That is why they seemed screwy. 
But we suggest here, however, that the results were not screwy, but the 
experimenters—a “screwy” person being by definition one who is not 
acting in accordance with the customary human values of the situation 
in which he finds himself.” 

There is a moral to this story as far as accidents are concerned. 

The state of fatigue (induced by environmental circumstances such 
as illumination, lack of rest pauses, food, poor atmospheric conditions 
etc.) is considered to have an important bearing on both output of work 
and accident rate. If, as has been shown by the Hawthorne Experiment, 
the psychological state of the individual can interfere with his fatigue, 
and so counteract the influence of environmental factors causing it, that 
the output rate does not follow the direct trend expected in simple cause 
and effect hypothesis, then it is surely safe to conclude that the same 
mental attitudes may well destroy the direct relationship between en- 
vironmental factors and accidents in many, if not all, cases. 

Furthermore, it must be remembered that in the previous investiga- 
tions, the effects between environmental factors and output and acci- 
dents were being studied without the subjects knowing that the study 
was in fact being made; (these merely consisted of the backroom analysis 
of existing records) whereas in the case of the Western Electric Company, 
experimental conditions were actually set up, and not only were the 
subjects cognisant of this, but were actually kept fully informed and 
taken into the confidence of the investigators. They thus reacted not 
only to the changing conditions in the physical environment, but also 
to the new interest being taken in them—new psychological factors were 
introduced into the situation and it was probably, though perhaps 
unconsciously, to these that the subjects were responding in part at 
any rate. 

It would appear safe, therefore, to conclude that: 

(a) there is a relationship between environmental factors and (i) out- 
put, and (ii) accidents, but, 

(b) that this relationship only obtains when the psychological en- 
vironment of the group is relatively stable. Disturb this in any way 
(by conducting an experiment which the subjects know of, or by showing some 
interest in the situation), and the whole relationship is upset. This is 
important as far as management action is concerned, for in this case 
it will always be difficult to know whether the individuals are responding 
to the changes which have been made in the physical environment, or 
merely to the fact that management has shown some interest, which has 
now altered the whole psychological atmosphere of the situation. 


It will always be impossible to predict whether the resulting [...] 
will be desirable or not. The basic factor determining this is the 
relations which exist between ‘experimenter’ and ‘experimentee’, and 
the interpretation which the latter put on to the actions of the former. 
Only when there is mutual confidence between the two can [...] 
reactions be anticipated. This fact may well explain some of the un- 
predictable results consequent upon many of the so-called [...] 
measures initiated by management, where no cognisance is taken of the 
prevailing psychological atmosphere. 


II. THE PHENOMENON OF ‘ACCIDENT-PRONENESS’ 


People first stumbled on this idea when it was observed that in [...] 
work-groups studied, a minority were responsible for the majority of the 
accidents. 

This spectacular observation led the unwary in the past to [...] 
the concept of accident-proneness as a means of explanation. 

It is a difficult matter to define what is meant by this term, or to 
evolve a sensible measure of whatever it indicates. [...] it was 
meant to define some personal trait as opposed to some characteristic of 
the environment, which predisposed some to have more accidents than 
others in work conditions where the risk of hazard was equal. 
The term would appear to imply moreover that it is possible either 
(a) to differentiate clearly between two classes of people—those who are 
accident-prone, and those who are not; or (b) at least to be [...] 
the group in terms of the severity of their proneness. 

Even today, after the trenchant comments of writers like [...] 
Blum (42), the fallacy of this line of argument, based on the [...] distribution 
of accidents in a single period of observation, does not appear to have 
penetrated the mind of the layman, or even of many specialists working 
in the field of accident prevention. It is appropriate therefore, in our 
study of this phenomenon that we should drive this point home by 
considering from an actual example the extent to which the accident- 
proneness concept enables one to differentiate between persons with the 
hope of producing beneficial results. 
The implied advantage of this theory is that after having spotted 
who the accident-prone cases are in a group, their removal from the 
group should result in a decrease in the relative frequency of accidents 
sustained by the remainder. This objective is laudable and obvious. If 
it cannot be achieved there is little point in our clinging to the concept 
of proneness. 

Kerrich in Part II exposes the fallacy of this line of reasoning, which 
is even more strikingly revealed by reference to another example from 
Adelstein’s (53) data, covering the records of 104 shunters with 3 years 
service. He reports the effect of removing cases with a high accident 
record in the first year of service as follows: 


The accident rates for the shunters who joined in 1944 and shunted for 3 years. 

                                            1st year   2nd year   3rd year 

Mean accident rate for 104 men                .557       .355       .317 
After removing 10 men with highest rate 
  in 1st year, i.e. 94 remaining men          .393       .361       .329 


In this case the annual accident rate in the second and third years 
actually went up a trifle after the removal of the 10 men who had had 
most accidents during the first year. 

Clearly evidence of this type indicates that our conception of the 
term accident-proneness stands in need of a critical examination from 
first principles. 
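
The arithmetic behind this result can be illustrated by a small simulation. The sketch below is not based on Adelstein's records: it assumes a hypothetical group of 104 men who all share the same yearly rate (0.4 is an arbitrary figure), so that nobody is accident-prone, and removes the 10 men with the worst first-year record. The function names `poisson_draw` and `removal_effect` are introduced here purely for illustration.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def average(values):
    return sum(values) / len(values)


def removal_effect(rng, n_men=104, n_removed=10, rate=0.4):
    """Fall in the mean rate of year 1 and of year 2 after removing the
    n_removed men with most accidents in year 1.

    Assumption: every man has the SAME yearly rate (no proneness at all).
    """
    year1 = [poisson_draw(rng, rate) for _ in range(n_men)]
    year2 = [poisson_draw(rng, rate) for _ in range(n_men)]
    # Rank the men by their year-1 record and keep all but the worst.
    order = sorted(range(n_men), key=lambda i: year1[i], reverse=True)
    kept = order[n_removed:]
    fall1 = average(year1) - average([year1[i] for i in kept])
    fall2 = average(year2) - average([year2[i] for i in kept])
    return fall1, fall2


rng = random.Random(1944)  # fixed seed so that the run is repeatable
runs = [removal_effect(rng) for _ in range(2000)]
avg_drop1 = average([f1 for f1, _ in runs])
avg_drop2 = average([f2 for _, f2 in runs])
print(f"average fall in year-1 rate: {avg_drop1:.3f}")
print(f"average fall in year-2 rate: {avg_drop2:.3f}")
```

Under these assumptions the first-year rate falls sharply, because the men were selected on that very record, while the second-year rate of the remainder is on the average unchanged. A fall in the year of selection is therefore no evidence that prone individuals have been removed.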

The significant history of the concept of accident-proneness goes 
back to 1919 when Greenwood and Woods (43), and Greenwood and 
Yule (44) made the first thorough-going approach to the analysis of 
accident statistics. Their findings and conclusions were later critically 
examined and extended by Newbold (45 and 46), in her classic contri- 
butions to the subject in 1926 and 1927. Despite the fact that these 
studies were made about a quarter of a century ago, they must still be 
regarded as almost complete summaries of our existing knowledge of this 
phenomenon, as very little real advance has been made since that 
time. Anyone wishing to understand the subject of proneness cannot 
avoid making a detailed and comprehensive study of these works, for 
in them he will find, not only the basic assumptions on which the concept 
of proneness depends, but all the essential warnings as to the limitations 
of these assumptions which have been so often disregarded ever since. 
It is an unfortunate fact that since these studies were made, it has been
assumed in almost all the literature that the existence of
accident-proneness is an established fact, and that it is a stable phenomenon in
individual make-up which makes it worth our while attempting to pre- 
dict and to use in practical accident prevention measures. Thus 
Farmer and Chambers (who coined the term accident-proneness) state 
explicitly (47): 

“Previous statistical investigations have shown that industrial
workers exposed to equal risks were unequal in their liability to sustain
accidents, and that this unequal liability was a relatively stable
phenomenon, manifesting itself in different periods of exposure and in
different kinds of accidents” . . . etc.

This claim would certainly not have been made by Newbold, Greenwood
and Yule without certain strict provisos, and it is here suggested
that most of the confusion of thought which has occurred since 1926
has resulted from an overstatement of the claims made by these authors,
and a disregard of their warnings. It would seem necessary, therefore,
that we should clarify our thinking on this subject by reviewing the
arguments of Greenwood, Newbold and Yule, and also the results of
subsequent research to see how far one is justified in a[...]
concept of proneness, and what use, if any, can legitimately [be made of]
it at present.

What is meant by the term Accident-Proneness? Farmer and Chambers
(48) state: ‘The fact that one of the factors connected [...]
liability has been found to be a peculiarity of the individual [...]
to differentiate between “accident proneness” and “accident [liability”].
“Accident proneness” is a narrower term than “accident [liability” ...]
means a personal idiosyncrasy predisposing the individual [...]
it in a marked degree to a relatively high accident rate. [“Accident]
liability” includes all the factors determining accident rate; [“accident]
proneness” refers only to those that are personal’. From [...]
is obvious that environmental factors plus the personal factor [of]
proneness in the individual determine the accident liability [of]
individuals in any given situation. It is, however, concerning [...]
confusion has arisen. This term is widely used in current
literature and yet it is scarcely ever well defined. This is apparent even
in the technical language of some of the authorities. Thus Vernon (19,
p. 48) in 1936 states:

“The accident-proneness of various individuals is not a fixed quality
but is liable to be affected by any and every change in their bodily
condition. This condition is influenced by external changes of
environment as well as by internal changes of physical and mental health”.

This statement clearly implies that accident-proneness [is a variable]
attribute, whereas in 1939 the same author states (8, p. 2[...]):

“Accident liability is influenced by many other personal qualities
besides inherent accident-proneness. It depends on general health . . .
age and experience, fatigue etc. . . .”

This second statement suggests that ‘inherent accident-proneness”’ 
is a stable invariable attribute of the individual (in much the same 
way as we regard his general mental ability, manual dexterity etc.) and 
that accident liability depends on: (a) Inherent accident-proneness. 
(b) Variations in personal health, age, experience, fatigue, etc. (c) The
risks inherent in the environmental situation. 

Whether or not Vernon changed his mind between 1936 and 1939 
is beside the point for he certainly gave no indication that this was so 
in the latter work, and the reader is still left in a state of indecision. 
The point has been raised here not with the intention of splitting hairs, 
but by way of illustrating the lack of precision in our thinking on this
subject. It is of vital importance, moreover, whether accident prone- 
ness is a stable or variable attribute, for surely there would be little 
point in attempting to devise means of measuring or assessing an un- 
stable phenomenon. The general ‘belief’ today is that accident-prone- 
ness is a fairly stable attribute. However, of equal importance is the 
question as to whether it is a general or specific factor. Thus it is con- 
ceivable that A may be more prone than B in situation X, but B more 
prone than A in situation Y. On the other hand, it may well be that 
no matter what the situation is A will always be more prone than B, 
although the liabilities of both will have changed by the same amount 
owing to the different risks in the situation itself. This latter view is 
one which is quite generally held, and the accident-prone individual is 
regarded as one who has many accidents at home, at work, on the public 
highways, and is in fact a sort of “calamity Joe” who is always “coming
unstuck”. There is, however, no proof for either of these two hypotheses 
and some pertinent comments have been made in this regard by Brown 
and Ghiselli (49). Again the matter is an important one for prediction 
and accident prevention policies. 

It is clear from the above that the term accident-proneness should 
be clearly defined before we attempt to make any practical use of it. A 
clear conception of its meaning can only be obtained by a close scrutiny 
of our existing knowledge and the methods of analysis which have given 
rise to it. These are, fortunately or otherwise, of a mathematical and
statistical nature. They cannot, however, be avoided for they involve 
basic assumptions which we cannot afford to ignore. Unless the student 
is prepared to study these carefully and comprehend their full
implications, he would be well advised not to make any use of the term
accident-proneness at all until a new meaning is given to the word on the
basis of research findings resulting from a new approach. There is regrettably
a tendency among many psychologists today to forget very conveniently
or unconsciously about the assumptions underlying all mathematical
and statistical “proofs”, and the fact that it is never possible in any
science to prove a hypothesis. The result is that very often claims are
overstated, tenuous relationships are magnified to causal relationships,
and possibilities become certainties as one turns over the pages of the
literature. We can only avoid this slip-shod method of thinking by
sticking closely to the evidence before us. If the reader should therefore 
find the following considerations laborious or irksome, he should also 
by the same token abandon any claims to a complete understanding of 
the concept of accident-proneness. 

The existing theory of accident-proneness started from a knowledge
of the observed facts which consisted of a frequency distribution of the
number of accidents occurring per individual within a given group in a
given environmental situation, and in a given period of time. This
merely recorded the number of individuals who had 0, 1, 2, ... accidents
in the given time period. The shape of the resulting histogram
formed by arranging the data in this way appeared to have regular
features about it and suggested that there was some underlying principle 
which gave rise to it, and which could possibly be stated in some law 
of distribution. 

The procedure has been conveniently summarised by Vernon 
(19, p. 29) as follows: 

“The statistical study of the condition now widely known as accident- 
proneness was initiated by Greenwood and Woods in 1919, when they 
investigated the frequency with which accidents occurred in groups of 
munition women engaged on various machine operations required in the 
manufacture of shells. They pointed out that while many of the women 
suffered no accidents at all, others suffered once or twice, and a few 
of them more frequently. The distribution of the accidents incurred 
might be due to simple chance, in the same sense that the chance of
drawing e.g. the ace of spades from a well shuffled pack of cards, would 
be once in every 52 trials on an average. Or again the workers might 
all start equal, but an individual who suffered one accident by pure 
chance might in consequence have her probability of suffering further 
accidents increased or decreased. The pain and inconvenience incurred 
might make her more careful in the future, and so reduce her liability. 
On the other hand it might increase her nervousness, and thereby
predispose her to more accidents. Accidents distributed on this basis
may, therefore, be called biassed. Still again, we may suppose that all 
workers did not start equal, but that some were from the outset more 
liable to suffer casualties than others. The accidents would then be 
distributed on the basis of unequal liabilities.” 
There were thus three clear hypotheses, and the technique was then
to compare each of the theoretical distributions resulting from these 
hypotheses with the actual observed distribution, to see which approxi- 
mated most closely to the given facts. It was then concluded that the 
hypothesis underlying this distribution was the one to be used when 
explaining the observed facts. 

In the classical experiment the results were as follows: 


                                        Theoretical accident frequency when
                                        distribution was determined by
No. of accidents   No. of women
incurred in 13     incurring accidents  Chance       Biassed      Hypothesis of
months             (observed            hypothesis   hypothesis   unequal initial
                   distribution)                                  liability

       0                 448              406          452           442
       1                 132              189          117           140
       2                  42               45           56            45
       3                  21                7           18            14
       4                   3                1            4             5
       5                   2              0.1            1             2
     Total               648            648.1          648           648


In this case the “unequal initial liability” hypothesis had the closest
correspondence with the observed facts. 
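
The “Chance Hypothesis” column can be approximately recovered from the observed column alone. The sketch below assumes that the chance hypothesis is fitted as a Poisson series whose mean is the observed mean number of accidents per woman; whether the original investigators fitted it in exactly this way is an assumption, so small differences from the published figures are to be expected.

```python
import math

# Observed distribution from the table: accidents -> number of women.
observed = {0: 448, 1: 132, 2: 42, 3: 21, 4: 3, 5: 2}

n_women = sum(observed.values())
n_accidents = sum(k * f for k, f in observed.items())
mean = n_accidents / n_women  # estimate of the common liability


def poisson_probability(k, m):
    """Probability of exactly k accidents under a Poisson series of mean m."""
    return math.exp(-m) * m ** k / math.factorial(k)


expected = {k: n_women * poisson_probability(k, mean) for k in observed}

for k in observed:
    print(f"{k}  observed {observed[k]:4d}   chance hypothesis {expected[k]:7.1f}")
```

The fitted series gives too few women with no accidents and too few with three or more, which is the discrepancy that led the investigators to look beyond pure chance.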

In fairness to the original investigators it must be emphasised that 
they themselves pointed out that this line of argument involved many 
assumptions and pitfalls which precluded a statement of the case in 
as simple a manner as has been given above. Unfortunately, their 
warnings have been almost universally disregarded since then, and the 
proof of the existence of accident-proneness has more often been ac- 
cepted as a straightforward procedure. From time to time there have 
been isolated references to the unwarranted claims which have been 
made on the basis of this original work, but it would seem desirable that 
a complete restatement and revaluation of the original theory should 
be made, since our thinking today has drifted so far from the original 
moorings clearly stated and specified by Greenwood and Newbold. The 
implications of the three hypotheses will be considered in turn: 


(1) Hypothesis of Chance Distribution. This hypothesis assumes that
accidents occur to individuals in the work group studied, by pure
chance. If this is so, it follows that all the individuals are equally liable
to sustain accidents in that environment and that, (i) the environmental
circumstances are homogeneous for all individuals; (ii) the individuals
are all homogeneous with regard to the personal (i.e. physical
and psychological) qualities which render them liable to sustain
accidents.

If this hypothesis were true it was maintained that the resulting
distribution of accidents in the group would obey the law of distribution
known as the Poisson Series. Kerrich in Part II has given a reasonable
mathematical proof that if the hypothesis is true, then a Poisson
distribution of accidents will follow. Now time brings changes in the
environment and also perhaps in the personal liability of individuals (due
to factors of learning etc.), but Kerrich has shown that these changes
will not affect the proof provided the environment and the personal liability
change equally for all individuals. This point naturally raises many
queries from the psychological point of view. These will be considered
later. Of far more concern to us is the fact that (as Kerrich emphasises)
the converse of his proof does not necessarily hold true. This
disadvantage of inductive reasoning is a very real one in the practical
situation where one can only start with the observed facts and work
back to a convenient hypothesis. Thus we may establish the fact that
our observed distribution of accidents obeys the Poisson Law, but this
does not necessarily mean (though it may well be so in many cases)
that our hypothesis of homogeneity in the group is true. There may
well be some other hypothesis which would give rise to the same result.

This is clearly illustrated in the following diagram.

Let $H$ = the hypothesis of homogeneity (smaller circle) giving rise to
          a Poisson Distribution.
    $A$ = other hypotheses giving rise to Poisson Distributions.
    $\bar{H}$ = hypothesis of non-homogeneity.
    $P$ = hypotheses giving rise to the Poisson Distribution (larger
          circle).
    $\bar{P}$ = hypotheses giving rise to non-Poisson distributions.

The following statements can thus be made.

    (1) If $H$ then $P$ follows.
    (2) If $\bar{H}$ then $\bar{P}$ follows.
but (3) If $P$ then $H$ may follow, or
        $A$ may follow.


Consequently the smaller the area A, being the difference between the 
two circles, the more likely it is that H will follow in statement (3), 
but this is never known. It is generally accepted that if a Poisson 
Distribution is obtained in the data, then it is most likely that the 
hypothesis of homogeneity (i.e. equal individual liability to accidents in
terms of both environmental and personal factors) is the explanation
of the observed facts. This is, however, the most one can say, and is
undoubtedly one of the reasons for Greenwood’s own reservations at the 
symposium held in 1949 (51). As evidence of the fact that the above 
considerations are not merely manifestations of the over-meticulous 
approach of the academically minded, it is appropriate to refer to 
Maritz’s (52) findings with regard to Adelstein’s (53) data. This covers 
the accident records of 122 shunters on the South African Railways 
over a period of 11 years. In this case the environmental work condition 
may be taken as nearly homogeneous as one could expect to obtain in 
actual practice. The results were briefly as follows: 

The data were split into two periods 1-5 years and 6-11 years. 
In each case the distribution of accidents was found to agree satis- 
factorily with the Poisson Distribution. It was not possible therefore 
to reject the hypothesis that the samples were drawn from a Poisson 
population of accidents, and in each case it might have seemed reason- 
able that the group should be regarded as homogeneous with regard to 
their personal liability especially as the exposure periods were consider- 
able. Contrary to expectation, however, it was found that when the 
two periods were combined, the distribution of accidents in the total 
period 1-11 years was no longer represented by the Poisson Law, but 
was well represented by another, the Negative Binomial. Furthermore, 
when the product-moment correlation was computed between the occurrence
of accidents in the two successive periods, it was found to be $r = .29$,
which refutes the previous conclusion of no-proneness within the
group as derived from the $\chi^2$-tests of the two observed Poisson
univariate accident distributions.

This example illustrates that a Poisson fit even over a period of 5 
years is not necessarily proof of the homogeneity of the group and that 
its non-homogeneity may in fact be obscured by the inadequacy of 
exposure period, and it is impossible to say a priori how long this ex- 
posure should be. 
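
The way in which a short exposure period can mask non-homogeneity may be sketched by simulation. The figures below are not Maritz's or Adelstein's: the sketch assumes a hypothetical group whose personal liabilities follow a Gamma distribution (shape 4, scale 0.25, chosen arbitrarily), each man's accidents in two successive periods being Poisson with his own liability. A very large group is used so that sampling fluctuation does not cloud the comparison.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def dispersion_index(counts):
    """Variance-to-mean ratio; close to 1 for a Poisson distribution."""
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return v / m


def correlation(xs, ys):
    """Product-moment correlation between two lists of equal length."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(11)
n_men = 20000  # hypothetical group size, far larger than any real group

# Assumed liabilities: mean 1 and variance 0.25 accidents per period.
liabilities = [rng.gammavariate(4.0, 0.25) for _ in range(n_men)]
first = [poisson_draw(rng, lam) for lam in liabilities]
second = [poisson_draw(rng, lam) for lam in liabilities]
combined = [a + b for a, b in zip(first, second)]

print("dispersion, first period   :", round(dispersion_index(first), 3))
print("dispersion, second period  :", round(dispersion_index(second), 3))
print("dispersion, combined period:", round(dispersion_index(combined), 3))
print("correlation between periods:", round(correlation(first, second), 3))
```

Under these assumptions the excess of the variance over the mean is roughly twice as large, relative to the mean, in the combined period as in either half. In a group of a hundred or so men the smaller departure within each half could easily pass a test of fit, while the combined record and the correlation between periods would betray the non-homogeneity.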
It is likely that a phenomenon similar to the one above [...]
an even shorter exposure period (18 months) in the data of [Brown and]
Ghiselli (49) relating to “On board accidents” of motormen [...]
the way in which the results are reflected in their table di[...]
one to state this conclusion as a definite fact. However, the [...]
of inferences drawn from observed Poisson distributions has [been]
illustrated in both theory and fact, and indicates that the [warnings of]
Greenwood and Newbold cannot be lightly disregarded.

(2) Biassed Distribution. In her original article Newbold [...]
that Greenwood and Yule toyed with this hypothesis but [...]
rejected it owing to its rather cumbersome mathematical c[onsequences].
Since then it has never been seriously considered in the literature,
and need not occupy much of our time here, although it is as well to
remember the principal reason for its rejection at the time.
Two alternatives were considered by the earlier investigators, viz. 
While the initial liability of individuals to accidents is equal, having
sustained one, (a) they are liable to have more in the future; (b) they are
liable to have less in the future. It is thus very difficult to incorporate
both these aspects in any single theoretical distribution of events. A 
special case of (a) was considered later by Dr. Irwin (54) who states: 
“Though this is a possible hypothesis, I do not, in fact, think it is 
the right one. This interpretation is contradicted by the fact that, if 
it were true, the accident rates of the same individuals ought to increase 
with time—the longer they had been at their job, the bigger their 
accident rate ought on the average, to be, and this is opposed to actual 
observed experience.” 
We must conclude also that the other alternative (b) above should
also be rejected by the same reasoning, mutatis mutandis.
While the cogency of these arguments is appreciated, it is
nevertheless permissible to point out that in both cases it is [...]
individuals behave in a rather mechanical fashion, and that, being
predisposed to react in one direction, they must perforce continue reacting
in that way as the result of some obsessive-compulsive mechanism
merely to suit the requirement of a mathematical series. This is clearly
not usually the case. Furthermore, it is quite obvious that in many
instances both alternatives (a) and (b) do operate in practice. If I
have an accident, or near-accident in any given situation, the danger
potential of which I have underestimated, then I take good care to
avoid any bold or carefree response on subsequent occasions.
Conversely instances may well occur where the shock and nervous
apprehension of an accident may predispose the individual to [lose]
control on subsequent occasions and sustain further injury.
The fundamental idea that the occurrence of one accident [...]
the probability of having the next seems to be an obvious one which
should be associated with the theories of learning. This notion need 
not be avoided merely because of cumbersome mathematical conse- 
quences, since the theory of stochastic processes, which has developed 
and streamlined the arithmetic in many other fields, may possibly serve 
the same function in this, where the problem is essentially that of ‘events 
occurring in time’. Kerrich’s treatment of the contagious hypothesis 
in Part II should be borne in mind when considering this question. 
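
The two alternatives (a) and (b) can be set side by side in a simple simulation of events occurring in time. The sketch below is not Kerrich's treatment: it assumes, purely for illustration, that every individual starts with the same rate (0.5 per year), that the rate changes once and for all after the first accident, and that the record covers 2 years. The function `accidents_with_bias` and all the numerical values are hypothetical.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def accidents_with_bias(rng, rate_before, rate_after, period):
    """Accidents in `period` when the rate switches after the first accident."""
    first = rng.expovariate(rate_before)  # waiting time to the first accident
    if first >= period:
        return 0
    # After the first accident the remaining time runs at the new rate.
    return 1 + poisson_draw(rng, rate_after * (period - first))


def dispersion_index(counts):
    """Variance-to-mean ratio; equal to 1 for a Poisson distribution."""
    m = sum(counts) / len(counts)
    v = sum((c - m) ** 2 for c in counts) / (len(counts) - 1)
    return v / m


rng = random.Random(54)
cases = {
    "(a) more liable afterwards": 1.0,
    "(b) less liable afterwards": 0.25,
    "no bias": 0.5,
}
index = {}
for label, rate_after in cases.items():
    counts = [accidents_with_bias(rng, 0.5, rate_after, 2.0) for _ in range(20000)]
    index[label] = dispersion_index(counts)
    print(f"{label:28s} variance/mean = {index[label]:.3f}")
```

With these assumed figures alternative (a) spreads the distribution out, giving a variance well above the mean, as unequal initial liability also does, whereas alternative (b) compresses it below the Poisson. A group in which both reactions occur could therefore show almost any intermediate picture.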

(3) Unequal Initial Liability—or Proneness. Whenever it has been 
found that the theoretical Poisson distribution did not fit the observed 
one, it has been customary to apply an alternative known as the Negative
Binomial. Again, if the fit is good in this case it is usually contended
that the hypothesis underlying the Negative Binomial must also explain 
the distribution of the observed facts. One of the hypotheses which 
would give rise to this type of distribution is that of unequal initial 
liability or proneness which resulted in the group being non-homogeneous 
in its liability to sustain accidents. It must be remembered that en- 
tirely different hypotheses may also give rise to the Negative Binomial 
(see Feller (74)). 

This latter proviso has been largely ignored in subsequent literature 
and this has led to a somewhat blind acceptance of the existence of the 
phenomenon of accident-proneness as an established fact. 

That this conclusion is unwarranted can only be really appreciated 
by reviewing the original argument of the authors, and following up the 
consequences of their conclusions. This may be briefly summarised as 
follows with as little reference to the mathematical details as possible: 

Let it be assumed that ‘n’ individuals are working in a homogeneous 
environment, and that these individuals are non-homogeneous with re- 
spect to their personal proneness to sustain accidents i.e. that they 
differ among themselves in this respect. 

It follows from these assumptions that the total group can then be 
divided into sub-groups 1, 2, 3, ..., k, each of which is homogeneous
within itself but differing from all the others. In other words the 
liability is the same within each sub-group, but differs from group to 
group. There can be one or more people in each sub-group. 

Now, although the liability of individuals in each sub-group is the 
same, they will not all have exactly the same number of accidents in 
any given period of time, for these will be distributed by chance within 
each sub-group, some having 0, 1, 2, ... etc. By hypothesis then,
(i.e. assuming homogeneity in each sub-group) the distribution of acci- 
dents within each sub-group will follow the Poisson Law, for Kerrich 
has shown that this will apply to any homogeneous group. 

It is then possible to calculate the mean number of accidents per
individual ($\lambda$) in each sub-group, and this will represent the
liability of individuals in each sub-group to sustain accidents. The
position can roughly be summarised as follows:

The accidents of sub-group 1 yield a Poisson Distribution with mean $\lambda_1$.
The accidents of sub-group 2 yield a Poisson Distribution with mean $\lambda_2$.
The accidents of sub-group 3 yield a Poisson Distribution with mean $\lambda_3$.
The accidents of sub-group k yield a Poisson Distribution with mean $\lambda_k$.

Each $\lambda$ will have a different value, indicating the different
liabilities or proneness of individuals in each of the sub-groups.

It is assumed that the number of individuals in each sub-group will
not be exactly the same but will vary, hence,

A certain number of individuals $W_1$ in sub-group 1 have mean $\lambda_1$.
A certain number of individuals $W_2$ in sub-group 2 have mean $\lambda_2$.
A certain number of individuals $W_k$ in sub-group k have mean $\lambda_k$.

The distribution of $\lambda$'s in terms of the number of individuals in each
sub-group must then take some form such as:
The area of each bar represents the number of individuals in each
sub-group as shown by the respective W’s. 

The shape of this distribution can never be known because we never 
know which individuals should be grouped together as homogeneous 
with regard to their personal liability to sustain accidents. Newbold 
pointed out quite specifically in 1926 that Greenwood assumed that this 
distribution of $\lambda$'s would take the form of a Pearson Type III (or a
Chi-Squared) Distribution. And later we find Prof. Greenwood (55)
stating:

“Our solution of the problem of a priori differentiation (i.e. unequal
initial liability) was empirical, in the sense that our only justification of
the particular choice of $f(\lambda)$ (i.e. the distribution of $\lambda$'s) was that it
ranged from $\lambda = 0$ and led to a statistically useful form. We did not
suggest that no law connecting $\lambda$ with $r$ on the other hypothesis (i.e.
Biassed distribution) would give an identical graduation”.

The chief point about this assumption is the fact that, if this is so, 
then it follows mathematically that, if a frequency distribution is made 
of the number of accidents sustained by the total group (i.e. irrespective 
of sub-groups), then this must take the form of a Negative Binomial, 
which can be regarded as the mathematical consequence of lumping
together or the superimposition of a series of Poissons each of which
applies to the sub-groups 1, 2, 3, ..., k already mentioned.
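
The step from a Type III distribution of liabilities to the Negative Binomial can be written out briefly. The notation below (shape $k$, scale $\theta$, and $r$ for the number of accidents) is an assumed modern parametrisation, not necessarily that of the original authors.

```latex
% Assumed Type III (Gamma) distribution of liabilities:
f(\lambda) = \frac{\lambda^{k-1} e^{-\lambda/\theta}}{\Gamma(k)\,\theta^{k}},
\qquad \lambda \ge 0 .

% Probability of r accidents for the whole group: a Poisson series
% averaged over the distribution of liabilities.
P(r) = \int_0^\infty \frac{e^{-\lambda}\lambda^{r}}{r!}\, f(\lambda)\, d\lambda
     = \frac{1}{r!\,\Gamma(k)\,\theta^{k}}
       \int_0^\infty \lambda^{r+k-1} e^{-\lambda(1 + 1/\theta)}\, d\lambda
     = \frac{\Gamma(r+k)}{r!\,\Gamma(k)}
       \left(\frac{1}{1+\theta}\right)^{k}
       \left(\frac{\theta}{1+\theta}\right)^{r},
\qquad r = 0, 1, 2, \ldots

% Mean and variance of the resulting Negative Binomial:
\mathrm{E}(r) = k\theta, \qquad \mathrm{Var}(r) = k\theta\,(1+\theta) .
```

The variance exceeds the mean by the factor $1+\theta$, so that the spread of liabilities shows itself in the group distribution only as an excess of variance over the Poisson value.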

This fact is also true for environments and personal liabilities or $\lambda$'s
changing in time, provided the changes are equal for all individuals. 

The convenience of this mathematical model can now be illustrated. 
If the accidents of the total group are distributed in the form of a 
Negative Binomial then the properties of this distribution can be
determined. And from these (the factorial moments) it is possible to
compute the properties (moments) of the distribution of $\lambda$'s which gave rise
to the Negative Binomial observed. This then would give us a good
deal of information concerning the distribution of proneness (as reflected
by the distribution of $\lambda$'s) in our total population, which is what we are
after. We are, however, considering the special case where the
distribution of $\lambda$'s takes the form of a Pearson Type III, but in general this
distribution can take any form, and it will still be possible to obtain
some information regarding the characteristics of this distribution of
$\lambda$'s, by calculating the factorial moments of the observed accident
distribution of the total group. But all these calculations depend on the
significant assumption that underlying any observed accident
distribution there does in fact exist a theoretical distribution of $\lambda$'s.
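
The calculation referred to can be sketched on Greenwood and Woods's observed figures. It rests on the standard result that, if each individual's accidents follow a Poisson series with her own liability, the j-th factorial moment of the observed counts estimates the j-th ordinary moment of the distribution of liabilities. The closing lines, which convert the moments into a Type III shape and scale, are an illustrative moment fit and not necessarily the method used by the original authors.

```python
# Observed distribution: accidents -> number of women.
observed = {0: 448, 1: 132, 2: 42, 3: 21, 4: 3, 5: 2}
n = sum(observed.values())


def factorial_moment(order):
    """Mean of x(x-1)...(x-order+1) over the observed distribution."""
    total = 0
    for x, freq in observed.items():
        term = 1
        for j in range(order):
            term *= x - j
        total += freq * term
    return total / n


mean_liability = factorial_moment(1)   # estimates the mean of the liabilities
second_moment = factorial_moment(2)    # estimates the mean of their squares
var_liability = second_moment - mean_liability ** 2

# Illustrative Type III (Gamma) fit by moments: shape and scale.
shape = mean_liability ** 2 / var_liability
scale = var_liability / mean_liability

print(f"mean liability     : {mean_liability:.4f}")
print(f"variance           : {var_liability:.4f}")
print(f"Type III shape     : {shape:.3f}")
print(f"Type III scale     : {scale:.3f}")
```

These figures describe a distribution of liabilities only if such a distribution exists; the arithmetic goes through equally well when the Negative Binomial has arisen in some other way.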

If a completely different hypothesis (involving no assumptions
regarding initial liability and consequent differences in [$\lambda$]) could also
result in a Negative Binomial distribution, then the calculation of the
properties of the distribution would have no meaning, and we should
in fact be calculating the properties of a distribution which did not
exist. The whole point is thus to decide whether in any set of
circumstances one is justified in assuming that, given a Negative Binomial
distribution of accidents for the total group (which is after all the only
observable data one can hope to get) this distribution must have arisen
as the result of a Pearson Type III distribution of $\lambda$'s. If, e.g. it can
be shown that the Negative Binomial could have arisen as the result
of some other hypothesis, then one’s confidence in this assumption must
be shaken.

Dr. Irwin (54) has in fact given two cases in which the Negative
Binomial Distribution could arise in terms of different hypotheses,
neither of which makes any reference to the underlying distribution of
$\lambda$'s. Thus he states:

“An alternative hypothesis made by Greenwood and [...]
people start out alike, but that the probability of a person having an
accident is altered if he has already sustained a previous accident. It
can be shown that the Negative Binomial may arise in this hypothesis
also, if the law of change for the probability is suitably chosen. I shall
use a method due to Kermack and McKendrick (Edin. Math. Journal
1925-26, p. 98), though not specifically applied by them to accidents”.

Kerrich’s elaboration of the “burnt-fingers” hypothesis [...]
illustration of this point.

For reasons stated earlier in this study (see Section (b) Biassed
Distribution above) he does not think this hypothesis is a likely one. He
goes on, however, to show that there is also a third possibility due to
Lüders (Biometrika XXVI pp. 108-128, 1934).

Even if one agrees with the authorities that these alternate hypotheses
are not as reasonable as the first, one’s confidence is somewhat
affected by the realisation that sooner or later other alternative
hypotheses may well be discovered to explain the Negative Binomial which are
as reasonable, if not more so. It is also possible that another form of
observed accident distribution which is neither Poisson nor Negative
Binomial may be discovered which fits the data better than say the
Negative Binomial. The mathematically minded student is invited to
consider the possibilities of some of Neyman’s (56) contagious
distributions in this regard. David (57, p. 68) fitted Neyman’s type (A)
contagious series to Greenwood and Yule’s original data and obtained a
slightly better fit than the Negative Binomial. This can be seen from
inspection:
No. of       Observed       Poisson         Negative     Neyman’s
Accidents    Frequencies    Distribution    Binomial     Series

    0           447            406             442          448
    1           132            189             140          128
    2            42             45              45           49
    3            21              7              14           16
    4             3              1               5            5
    5             2              0.1             2            1
  Total         647            648.1           648          647


The author adds further to our distress by stating: 

“It is, however, difficult to see what the parameters of either
distribution mean”.
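
The relative closeness of the three fitted series to the observed frequencies can be put into figures by a rough Pearson chi-squared calculation on the columns of the table as printed. The pooling of the classes with 3 or more accidents is a choice made here to avoid very small expected numbers; it is not taken from David, and the resulting values serve only to rank the fits.

```python
# Columns of the table, for 0, 1, 2, 3, 4 and 5 accidents.
observed = [447, 132, 42, 21, 3, 2]
fitted = {
    "Poisson": [406, 189, 45, 7, 1, 0.1],
    "Negative Binomial": [442, 140, 45, 14, 5, 2],
    "Neyman Type A": [448, 128, 49, 16, 5, 1],
}


def pooled(freqs, first_pooled=3):
    """Pool all classes from `first_pooled` accidents upwards into one."""
    return list(freqs[:first_pooled]) + [sum(freqs[first_pooled:])]


def chi_squared(obs, exp):
    """Pearson's sum of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


obs_pooled = pooled(observed)
chi2 = {name: chi_squared(obs_pooled, pooled(col)) for name, col in fitted.items()}

for name, value in chi2.items():
    print(f"{name:18s} chi-squared = {value:6.2f}")
```

On this rough reckoning the Poisson series is far from the data, while the Negative Binomial and Neyman's series are both close, the latter by a very narrow margin. So small a difference cannot by itself decide between the hypotheses that lie behind the two series.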

Consider furthermore, the possibility that people may start with 
equal liability, but learn to adjust their behaviour at different rates. 
The idea is feasible but the resulting distribution is as yet unknown. 

One is forced to conclude, as in the case of the equal chance hypothesis
above, that if the assumption of non-homogeneity is true (i.e.
given accident-proneness) then a non-Poisson distribution of observed 
accidents will follow, (which may be Negative Binomial). The converse 
of this is, however, not necessarily true. As Greenwood (51) states: 

“A Negative Binomial could arise in a great many ways, and if one 
had a Negative Binomial and it was a good fit, accident proneness might 
be involved or might not. If one had a desperately bad fit one could 
be fairly sure, not quite sure, that accident proneness had no part in the 
business’’. 

From the above it will be appreciated that the old approach to the 
study of ‘accident-proneness’ is sterile and effete. This is principally 
because it confined its reasoning to conclusions which could be drawn 
from fitting theoretical curves to observed univariate frequency distribu- 
tions. The restricted nature of this approach has caused us to get 
bogged down in the inconclusiveness of inductive reasoning, and has led 
us into a dilemma from which we are only slowly beginning to extricate 
ourselves. 

The position can be briefly summarised as follows: 


(a) A population can be considered homogeneous (H) with respect to
(i) personal attributes, (ii) environmental factors.

If a population is homogeneous (H) with respect to both (i) and (ii),
then a Poisson distribution of accidents will follow. (See Kerrich).
On the other hand, if the observed distribution is found to be Poisson,
then the population may well be H with respect to both (i) and (ii); but
it does not necessarily have to be so, for other, as yet untried, hypotheses
may well give rise to the same observations.

(b) If a population is non-homogeneous (H̄) with respect to either (i)
or (ii), or both, then a Non-Poisson distribution of accidents will follow
(which may be Negative Binomial).

On the other hand, if the observed distribution is found to be Non-
Poisson, then this may well be explained by the non-homogeneity (H̄)
of the population with regard to either (i) or (ii), or both; but, again, this
does not necessarily follow, for other hypotheses (see above) may well
give rise to the same results.


(c) The dilemma is reached when it is realised that, even if the
hypothesis of non-homogeneity (H̄) is accepted as the true explanation
of the observed Non-Poisson frequency distribution, one still does not
know whether this is due to non-homogeneity (H̄) in either (i) or (ii),
or both; and this is the heart of the problem, and the cause of all the
confusion and wishful-thinking today. For people have been only too
ready to assume that the Non-Poisson (which may be Negative
Binomial) observed frequency distribution means non-homogeneity (H̄),
and have then gone on glibly to attribute this to (i) personal attributes
alone, in an attempt to shift the blame from the environment to the
individual by calling people, and not work-places, accident-prone.
This step can only be taken provided the environmental factors are
equal for all, and there can be as little justification for assuming this
(except under the most artificially controlled conditions), as there is for
assuming that people are homogeneous with respect to their personal
attributes.

One can readily appreciate the need to determine whether the
unequal liability of people to accidents is to be attributed to either (i)
personal or (ii) environmental causes, for this knowledge would be of
vital importance for accident prevention measures. If, e.g., it could be
established that the non-homogeneity were with respect to (ii) only,
then remedies should be directed at inequalities in the environment.
Furthermore, no remedies directed at (i), the personal factors, would
have the slightest influence on the relative accident rates of the
individuals, except, perhaps, to reduce them all alike. This statement is
equally true in the case of remedies relating to (ii) applied to a group
which is non-homogeneous with respect to (i).

As yet no strict definition has been evolved to define a population
which is homogeneous with respect to (i) or (ii) only. (Kerrich’s
equations 1.9 and 1.10 in Part II involve the assumption of homogeneity
in both (i) and (ii).) Consequently we cannot construct mathematical
models of the accident distribution which would result from such an
hypothesis, as distinct from those resulting from non-homogeneity in
both. It is precisely because we cannot isolate and study the influence
of these two sets of factors separately that the term “accident-proneness”
is today such a confused concept.

The personal factor in proneness (which differentiates between indi- 
viduals) should be more properly based on an intra-personal study of 
individuals’ accident records, as suggested by Arbous and Sichel (59) 
in the field of absences. By so doing, it should be possible to determine 
what the individual’s distribution of accidents is in time, and whether 
it is Poisson. In this way one could test the hypothesis that the Negative 
Binomial distribution is in fact generated by the superimposition of 
individual Poissons with differing λ’s. This would lead to the
construction of a mathematical model from the ground floor upwards and
not from the roof downwards, as has been done previously, and might 
well avoid the complications attendant upon the latter approach. 
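
The hypothesis of superimposed individual Poissons can be illustrated numerically. The following is only a sketch under stated assumptions: the rates λ are taken to be Gamma-distributed (one of several ways in which a Negative Binomial may arise, as Greenwood warns), and the group size, mean rate and shape parameter are hypothetical values chosen for illustration, not figures from any study quoted here. It compares the variance-to-mean ratio of accident counts in a group with equal liability against a group with unequal liability.

```python
import math
import random
from statistics import mean, pvariance


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_group(n_people, mean_rate, shape, rng):
    """Accident counts for one exposure period.

    shape=None gives every person the same rate (equal liability).
    Otherwise each person's rate is drawn from a Gamma distribution
    with the given shape and the same mean (unequal liability), so
    that the pooled counts follow a Negative Binomial.
    """
    counts = []
    for _ in range(n_people):
        if shape is None:
            lam = mean_rate
        else:
            lam = rng.gammavariate(shape, mean_rate / shape)
        counts.append(poisson_draw(lam, rng))
    return counts


rng = random.Random(1951)
# Hypothetical values: 20,000 people, 2 accidents per period on average.
equal = simulate_group(20000, 2.0, None, rng)
unequal = simulate_group(20000, 2.0, 1.0, rng)

# For a Poisson the variance equals the mean, so the ratio is near 1.
ratio_equal = pvariance(equal) / mean(equal)
# For the Gamma mixture the ratio is near 1 + mean_rate / shape.
ratio_unequal = pvariance(unequal) / mean(unequal)
print(round(ratio_equal, 2), round(ratio_unequal, 2))
```

A ratio well above unity in the pooled data is what the mixture predicts, but, as argued above, it does not by itself establish that each individual’s record is Poisson; that would require the intra-personal study described in the text.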


III. A MORE PRACTICAL APPROACH 


Irrespective of the underlying hypotheses (inaccessible and other- 
wise) which may give rise to the unequal distribution of accidents to 
a group of people in any observed period, considerable practical ad- 
vantage would be gained if one could predict in any subsequent period 
which individuals are in fact going to sustain most of the accidents. The 
practical man, despite all theoretical explanations, will in the end need 
this information as a basis for his future accident-prevention policy. 

The first line of attack in this approach is to attempt to predict 
the future events in terms of past events of the same kind. Further- 
more, if we assume that unequal initial liability (whether due to per- 
sonal or environmental causes or both) is responsible for the observed 
distribution in the first period, then, provided the personal factors and 
environments have remained constant (or changed equally for all), we 
expect that there should be a correlation between the individual’s
accident records of the observed and future periods. Thus, if unequal initial
liability is the correct underlying hypothesis, and gives rise to the
Negative Binomial, as it often does, this form of distribution should be
closely associated with correlation. This view is supported by Maritz
(52) and Lundberg (78). It is also the practical approach which has
been adopted in a cognate field of study by Arbous and Sichel (59)
when dealing with absenteeism data. It will be convenient at this stage
to consider some of the advantages of this approach.

It will be appreciated that the interpretation of the product-moment
correlation coefficient, r, resulting from discrete bivariate distributions 
(such as are associated with accidents and absences) presents some 
problem, and can only be defined as the slope of the standard regression 
lines. This difficulty is, however, no serious obstacle to this approach, 
for here we are primarily interested in seeing what practical use can 
be made of the type of bivariate distribution here being dealt with.
If some theoretical bivariate distribution (in this case the Negative 
Binomial) whose properties are known, can be fitted satisfactorily to 
the observed, then some important consequences follow. This was suc- 
cessfully demonstrated by Arbous and Sichel (59) in the case of ab- 
sences observed over 2 one-year periods where a coefficient (r) of .75 
was expected theoretically in terms of a formula derived from Lundberg 
(78), and subsequently verified as being .74 in a follow-up study. 

A close fit was also obtained between the observed bivariate dis- 
tribution and the theoretical Negative Binomial, as the result of which 
it was possible to work out the operating characteristics (cumulative 
frequency ogives) of the given correlation surface. 

On the basis of the observed data for the first year only, the following 
predictive statements were made:

Out of every 100 persons in the population: (the figures given are 
approximated to the nearest whole number). 

(i) 19 will have 8 or more absences in the next year. (ii) If all those 
with 4 or more absences in the first year were subjected to remedial 
action, 50 cases would have to be dealt with. (iii) Of these 50 cases 18 
will have 8 or more absences in the next year i.e. 18 out of the total of 
19 absence-offenders will be brought within the purview of remedial 
measures. This means a +92% confidence that treatment is being
applied to the cases which in fact need it. (iv) Of the remaining 32
cases (in the 50 selected for treatment), at least 3/4 or 24 will have an 
absence record which is at least worse than the average good attender— 
i.e. 4 or more absences in the next year. 

The above statements were all verified and were found to be almost 
exactly correct in a follow-up study of a sample of 248 cases. 
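
The arithmetic behind these predictive statements can be set out explicitly. The sketch below uses only the rounded whole-number counts quoted above; the variable names are introduced here for illustration. Because the counts are rounded, the proportion of offenders captured comes out at about 95 per cent, whereas the text quotes 92 per cent, presumably from the unrounded figures.

```python
# Rounded counts per 100 persons, as quoted in statements (i) to (iv).
offenders_next_year = 19        # (i) 8 or more absences next year
selected_for_treatment = 50     # (ii) 4 or more absences in first year
offenders_selected = 18         # (iii) offenders among those selected
poor_attenders_in_rest = 24     # (iv) at least 4 absences next year

# Share of next year's offenders who fall within the treated group.
capture_rate = offenders_selected / offenders_next_year
# Share of the treated group who turn out to be offenders.
hit_rate = offenders_selected / selected_for_treatment
# Offenders who escape remedial action altogether.
missed = offenders_next_year - offenders_selected
# Treated cases who are not offenders, and the share of those who are
# nevertheless worse than the average good attender.
remaining = selected_for_treatment - offenders_selected
poor_share = poor_attenders_in_rest / remaining

print(round(capture_rate, 3), round(hit_rate, 2), missed, remaining,
      round(poor_share, 2))
```

Set out in this way, the statements show both what the selection rule achieves (nearly all offenders are reached) and what it costs (most of the 50 cases treated are not offenders).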

These statements are useful and concrete. They should be
contrasted with the usual vague concept “that the correlation coefficient, r,
between events occurring in the two periods is .74”. The latter is only
one conclusion, which, by itself, enables one to do very little. This
new approach has, moreover, the advantage that one at least knows
how inefficient one’s predictions are going to be.

The practical advantage of being able to make similar predictive
statements in the field of accidents is obvious.

It must be remembered, however, that a correlation coefficient of
some magnitude is an essential pre-requisite, particularly as the type 
of surface being dealt with is non-homoscedastic, and the confidence 
limits are even less satisfactory in the case of this type of distribution. 

With this end in view, let us now examine some of the results which
have so far been achieved by accident studies, and in doing so temper
our enthusiasm with the realisation that, as Maritz (52) has shown, even
if the marginal distributions are Negative Binomial, it is still possible
theoretically to have zero correlation.

When considering the findings of the various research workers in
this field, it is essential to draw a distinction, despite its arbitrary 
nature, between major and minor accidents. The former consists of 
lost-time accidents which result in incapacity of the individual which 
is usually reportable under Workmen’s Compensation Legislation. The 
latter consists of small injuries such as minor bruises, cuts, burns etc., 
which are treated at the factory dressing station, after which the indi- 
vidual returns to work without further incapacity or loss of time. To 
draw a distinction between these types is imperative, because the minor 
accident may not only reflect the liability of individuals to these events, 
but also a tendency on the part of some to report to the dressing station 
while others do not, as has already been mentioned by Newbold in the 
case of minor sicknesses. It is virtually impossible to ensure absolute 
uniformity in the reporting of these occurrences. Major accidents are 
almost completely free of this spurious factor. 

In their original works Greenwood and Newbold were careful to 
define their data clearly and pointed out that the accidents investigated 
were chiefly of the minor type and some were even stated to be “trivial”. 
It is to be regretted that this rigorous approach has not been adopted 
by all subsequent investigators. Some even go so far as to talk of 
“real” (sic) accidents, without defining specifically what distinction is
meant by this term. This makes it difficult to decide to what extent 
the results are a reflection of the spurious relationship already mentioned. 
We shall, therefore, confine ourselves to those investigations where we 
can be reasonably sure of the type of accident dealt with. 


(1) Correlations between Minor Accidents in two Successive Periods. In 
the following studies accidents were usually considered as any injury, 
however slight, which was treated at the ambulance room, factory 
dressing station or from first aid boxes. It is natural that these should 
include some major accidents. However, owing to the overwhelming 
predominance of small cuts, bruises, burns etc., the results must be 
regarded as due chiefly to the influence of these minor accidents.

(i) Newbold’s (46) results are as follows:

                               Lengths of Periods Observed
Group             No. of                                       Correlation,
                  People       1st           2nd               r
D.I    Females      19         2 yrs         5 mths             .21
D.II   Females      42         2 yrs         5 mths             .36
E.I    Males       445         1 yr          1 yr               .57
E.II   Males       288         1 yr          1 yr               .25
E.III  Males       226         1 yr          1 yr               .20
E.IV   Males       288         1 yr          1 yr               .62
G.I    Males        47         1 yr          1 yr               .36
G.II   Males        82         1 yr          1 yr               .57
G.I    Females     120         1 yr          1 yr               .53
G.II   Females      50         1 yr          1 yr              -.01
H      Females     227         6 mths        6 mths             .05


(ii) Greenwood (43) obtained correlations varying from .37 to .72 for 
various groups studied. (iii) Wong and Hobbs (58) obtained a co- 
efficient of .56 for their group of 290 male brewery workers over 2 four- 
weekly periods. (iv) Farmer and Chambers’ (60) results for 2 groups 
of R.A.F. and 3 groups of Dockyard Apprentices would appear to fall 
under the category of minor accidents, for when comparing these with
sickness they state “whether the correlation is due to a real relationship
between ill-health and liability to sustain accidents, or whether it is
due to a tendency to report both accidents and sicknesses cannot be
determined in the present data”.

Their results over two successive periods of one year vary from .02 
to .57 with one instance of .84. Their mean weighted coefficient be- 
tween all periods is .358. 

In a subsequent report (I.H.R.B. No. 68) the authors state: “The
intercorrelations between different periods of exposure, given in Report
No. 55, are higher . . ., the mean being .358, but this was obtained
from groups that were not strictly homogeneous. Each year’s entry
into the dockyard was treated as a group (to equate experience
presumably) in spite of the fact that the members were employed in
different trades, not allowing for this in the previous investigation would
account for the greater magnitude of the correlation coefficients
between successive years’ accidents for the members of the groups would
be constantly exposed to unequal accident risks throughout the observed
period. . . .” (v) Arbous and Sichel (59) obtained a correlation of .59
for a group of 318 steelworkers over two 6 months periods. In this
study major accidents involving the loss of one or more complete shifts
were specifically excluded.

The above are fairly representative of the research findings in regard
to minor accidents. With rare exceptions the correlations are all
positive and appear to be significant, indicating that there is definitely
some tendency for individuals to repeat their records as far as minor
accidents are concerned. The fact still remains, however, that it is
impossible to say whether this reflects the different liability of
individuals to sustain accidents, or merely the artifact of a tendency on
the part of some to report these occurrences, while others do not. To
answer this question conclusively would involve the setting up of ex- 
perimental conditions subjected to the most rigorous control and super- 
vision, if this were indeed possible. Until this has been done it is im- 
possible to say whether accident-proneness does in fact exist as far as 
minor accidents are concerned. 


(2) Correlations between Major Accidents in two Successive Periods. (i)
Farmer and Chambers’ (47) investigation among 4 groups of transport
drivers would appear to fall under this heading: their data were supplied
by an insurance company and by a haulage firm. However, referring
to group A the authors state: ‘The standard of reporting accidents 
was high, so that trivial accidents as well as serious appear in the 
records”. 

The inclusion of minor accidents may account for the higher
correlation coefficients obtained for this group.

The results are as follows: 


Omnibus and Trolley Bus Drivers 
Correlation between 
Accidents in years Group A Group B Group C Group D 
1 and 2 .182 .235 .071
1 and 3 .235 -.063 .058
1 and 4 .177 .127
1 and 5 .274
2 and 3 .328 -.078 .225
2 and 4 .176 .195 .251
2 and 5 -
3 and 4 .212 .016 .296
3 and 5 .273 
4 and 5

(ii) Adelstein’s (53) data, where major accidents were defined as those
involving a loss of 7 days or more, yield a correlation of .29. This is
in respect of 122 shunters over a period of 11 years.


Thus even if our basic assumptions are valid and our observed
frequency distributions are to be explained in terms of unequal initial
liability, the stability of the phenomenon of accident-proneness is only
of the order of .2, or .3 at the most in the case of major accidents. The
coefficient increases somewhat when minor accidents are considered, or
when minor and major are taken together, but this rarely exceeds
about .6, and in any case is suspect because of the spurious factors
already mentioned. Furthermore, anyone who has had experience in
dealing with the correlation surface yielded by the Bivariate Negative
Binomial distribution, will immediately realise how inadequate a
coefficient of even .6 is for prediction purposes.

In fairness to the theory of proneness, however, it must be pointed
out that as the successive exposure periods are increased, the correlation
coefficient will also increase, but it is likely (as Adelstein has
shown) that these will have to be so long (at least two p[...]
years) that there will be little point to the discovery of this [...]
after all the damage has been done.
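
The dependence of the correlation on the length of the exposure period can be made concrete under one simple model. The following is a sketch only, not the method of any author quoted: it assumes that each person has a constant yearly rate λ, that λ is Gamma-distributed over the group with mean m and shape k, and that accidents in two successive periods of t years each are independent Poissons given λ. Under these assumptions the correlation between the two periods is r = tm / (k + tm), which rises with t. The values of m and k below are hypothetical, chosen merely so that r is .2 for one-year periods.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def pearson_r(xs, ys):
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def theoretical_r(mean_rate, shape, years):
    """r between two periods of equal length under the Gamma mixture."""
    return years * mean_rate / (shape + years * mean_rate)


def simulated_r(n_people, mean_rate, shape, years, rng):
    """Simulate two successive periods and correlate the counts."""
    first, second = [], []
    for _ in range(n_people):
        lam = rng.gammavariate(shape, mean_rate / shape)
        first.append(poisson_draw(lam * years, rng))
        second.append(poisson_draw(lam * years, rng))
    return pearson_r(first, second)


rng = random.Random(1951)
MEAN_RATE, SHAPE = 0.25, 1.0   # hypothetical illustrative values
results = {}
for years in (1, 2, 6):
    results[years] = (theoretical_r(MEAN_RATE, SHAPE, years),
                      simulated_r(40000, MEAN_RATE, SHAPE, years, rng))
    print(years, round(results[years][0], 3), round(results[years][1], 3))
```

With these illustrative values the theoretical coefficient is .2 for one-year periods and reaches .6 only when each period is six years long, which is the practical difficulty noted above.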


(3) Correlation between Minor Accidents and Major Accidents in a given
Exposure Period. In view of the paucity of data when studying major
accidents alone it has been the hope of investigators that the study of
minor accidents, which could be undertaken over shorter periods,
would overcome this difficulty. It was felt that there was no [...]
difference between major and minor accidents and that the [...]
injury was the one chance element in the situation. These minor
occurrences were therefore regarded as clues which could be used to
anticipate and prevent the major calamities, which after all are the
main concern of investigators. Those individuals who were liable to
minor accidents, were therefore those liable to sustain major accidents in
the future. On this basis an admirable instrument for prediction could
be devised. This theory was tested by correlating the minor and major
accidents of individuals in a given exposure period, for the efficiency of
prediction would naturally depend on the size of the resulting
coefficient. The results have been extremely discouraging.

(i) Farmer and Chambers’ (32) results for a group of 14,5[...]
workers over a period of one year are as follows:

                                  Correlation Coefficients
Trade          No. in      Major and      Major and Minor,
               Group       Minor          Exposure Constant
Labourers      6,050        .138            .141
Shipwrights    3,507        .170            .172
Boys             429        .252            .249
Rivetters        295        .254            .264
Caulkers         181        .311            .424
Fitters        2,634        .055
Drillers         539       -.039
Smiths           196       -.022             -
Welders           54       -.004             -


(ii) Adelstein’s (53) result for a group of 304 shunters over a period 
of 5 years was .102. (iii) Arbous and Sichel (59) obtained .11 for a 
group of 318 steel workers over a period of one year. The above findings 
show quite clearly that minor accidents do not enable us to make any
predictions with regard to major accidents. This conclusion adds 
further weight to the argument that there are perhaps spurious factors 
associated with the minor accidents. 


(4) Correlation between different types of Accidents. It is very often be- 
lieved that ‘accident-proneness’ is a general factor manifesting itself in 
any situation in which the individual is placed. The “accident-prone 
Percy” of the Safety Films is a person who is constantly injuring himself 
at home, at work and on the public highways and the possession of this 
unfortunate trait involves him in all types of accidents. 

Confirmation for this conclusion has been sought by correlating (a) 
one type of accident with another in a given situation, and (b) home 
accidents with industrial accidents. The results have been equally dis- 
appointing: 

(i) Farmer and Chambers (47) give the following results for two groups
of motor drivers.

                                          Correlation Coefficients
                                          Group A      Group B
Errors of Judgment and over-runs           .367         -.021
Errors of Judgment and skids               .148          .049
Errors of Judgment and miscellaneous       .234         -.001
Errors of Judgment and blameless           .161          .045
Over-runs and skids                        .077          .082
Over-runs and miscellaneous                .233          .097
Over-runs and blameless                    .149          .129
Skids and miscellaneous                    .161          .074
Skids and blameless                        .214          .060
Miscellaneous and blameless                .390          .075


The most the authors could claim in this case was: “The relation is
not significant in most individual cases but the tendency, though slight, 
is fairly consistent”. (ii) Brown and Ghiselli (49) give their results for 
59 trolley car motormen over a period of 18 months as follows: 


                                          Intercorrelations
                                       1      2      3      4      5
1. Collision with Pedestrians                .10   -.11   [...]   .02
2. Collision with trolley cars                      .22    .04   [...]
3. Collision with motor vehicles                           .03    .07
4. Boarding and Alighting accidents                               .19
5. On-board accidents


The correlation between Total Collision and Non-Collision accidents 
was .09. In the case of another group of 34 motor coach operators it 
was .19. In an earlier study a coefficient of .25 had been obtained. 
(iii) Newbold (45) found coefficients of correlation of the order of .2 to
.3 between various types of accidents occurring at home and on the 
job. (iv) Adelstein (53) in two groups of 122 and 181 shunters over 
periods of 11 and 5 years obtained coefficients of the magnitude .186 
and .029. 

The above findings lead one to conclude that the concept of the 
“accident-prone Percy” is rather a figment of the imagination resulting 
from wishful thinking. They are moreover important from the point
of view of accident prevention, for they mean that an individual’s
liability or proneness to accidents (if such a thing exists) in one set of 
circumstances will give little indication of his proneness in another. 

Research results have been quoted in some detail in this study with 
the specific purpose of exposing the exaggerated claims which are found 
in current accident prevention literature. From the above tables the 
student will be able to judge for himself. That this is necessary can be 
illustrated by quotation from a more recent article appearing in 1949 
(58). When referring to the incidence of accidents in successive periods, 
the authors state: 

“Our findings showed a similar tendency. By using the accident
records for two four week (sic!) periods it was found that the 17 workers
who had the highest accident record during the first period had 2.4
times as many accidents as a similar number of workers who had the
lowest record during the first period. This finding when considered
with the frequency of personal accidents noted in this group would
suggest that the accident tendency is a lifelong characteristic, and it
appears to invade all aspects of life. Those who have most accidents
at work also have the greatest number of accidents away from work”.

And later when considering the relationship between minor and
major accidents, and actually quoting Farmer and Chambers’ study of
14,000 workers the results of which are given in this text, they conclude:

“Yet from the point of view of prevention the major interest is in 
the factors concerned in the more serious injuries. Therefore it is 
important that we know something of the relationship between the 
frequency of major and minor accidents. In a British study of the 
accident records of 14,000 workers, it was shown that the group who 
have the most frequent minor accidents also have an undue proportion 
of major accidents.” 

The student will see for himself just how misleading this statement 
is, by reference to the actual results given above (see para. 3 (i).) 

In the final summary the authors state categorically: 


“1. The ‘accident-prone’ worker can be identified by a simple analysis
of the frequency of occurrences of superficial injuries.

2. This tendency towards accidents appears to be a stable
characteristic.

3. Those who have the most frequent minor accidents have a
disproportionate number of major accidents”.


It is to be lamented that statements of this type should be allowed
to acquire the mantle of authority by being accepted for publication
in journals of repute. And yet it is not surprising that this should
happen for it is an unfortunate fact that those who have done most
work in attempting to unravel the influence of personal factors in
accident causation, have also done most to confuse our thinking in
this regard. Only those, possessing a superficial knowledge of the
standard literature on the subject, could so disregard the warnings of
Greenwood, Newbold and Yule as to make such exaggerated claims on
the basis of these research findings. These factors, coupled at times
with the use of questionable statistical procedures, have resulted in the
creation of a ‘popular belief’, which has been enshrined in the Emergency
Report No. 3 of the Industrial Health Research Board (61).

The evidence so far available does not enable one to make
categorical statements in regard to accident-proneness, either one way or
the other; and as long as we choose to deceive ourselves that they do,
just so long will we stagnate in our abysmal ignorance of the real
factors involved in the personal liability to accidents. This does not
mean that accident-proneness does not exist, but that so far we have
not succeeded in defining it, assessing its dimensions and constituent
elements, nor evolved a technique for putting it to practical use. Even
the straightforward approach (adopted by Arbous and Sichel in
the case of absenteeism) of working out the ‘operating characteristics’
of a bivariate distribution would seem to be of little value in view
of the low correlation coefficients obtained.

Greenwood (55) himself would appear to have arrived at a similar
conclusion in the end when he states:

“I conclude by expressing a hope not likely to be fulfilled - that
Mr. Chambers and his colleagues should be given the opportunity of
studying Air Force Accidents; it does seem to me possible that the
hypothesis which does not seem to have much value in the ordinary industrial
field might be of value here. It is probable that the high standard of
selection eliminates the pathologically prone, so it might be easier to
try out a different hypothesis, and perhaps of practical importance.”

What is needed is obviously more fundamental research and perhaps
a completely new way of looking at the old problem.

One investigation in particular would appear to be a most profitable
line of development for the future.

What effect has restriction in the size of the sample studied on the
magnitude of r between accidents in successive periods? The cases
which are self-eliminated and cannot be included in correlation because
of incompleteness of exposure, are usually those who have accidents,
and are precisely the ones we are interested in. A mathematical model
should be built, and tested if possible, to allow for this factor. Without
this allowance, it is possible that r is estimated on a restricted range viz:
those cases who happen to survive the respective exposure periods which
may be long.
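
The possible size of this effect can be gauged by a small simulation. What follows is a toy sketch and not the mathematical model called for above. It assumes Gamma-distributed individual rates with conditionally independent Poisson counts in two periods, and it assumes, purely for illustration, that each accident in the first period independently removes the person from further observation with a fixed probability; this elimination probability, like the other parameter values, is hypothetical.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def pearson_r(xs, ys):
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_people, mean_rate, shape, drop_per_accident, rng):
    """Return (first, second) count pairs for everyone and for survivors.

    The second-period count of an eliminated person is what he would
    have sustained had he remained under observation.
    """
    everyone, survivors = [], []
    for _ in range(n_people):
        lam = rng.gammavariate(shape, mean_rate / shape)
        first = poisson_draw(lam, rng)
        second = poisson_draw(lam, rng)
        everyone.append((first, second))
        # Each first-period accident may eliminate the person.
        stays = all(rng.random() >= drop_per_accident
                    for _ in range(first))
        if stays:
            survivors.append((first, second))
    return everyone, survivors


rng = random.Random(1951)
# Hypothetical values: mean rate 1, shape 1, elimination chance 0.5.
everyone, survivors = simulate(40000, 1.0, 1.0, 0.5, rng)
r_all = pearson_r(*zip(*everyone))
r_survivors = pearson_r(*zip(*survivors))
print(len(survivors), round(r_all, 3), round(r_survivors, 3))
```

Under these assumptions the coefficient computed on the survivors alone falls appreciably below that of the complete group, which suggests that published values of r based on complete exposures may understate the stability of the records.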


IV. THE PREDICTION OF ‘ACCIDENT-PRONENESS’ BY OTHER MEANS 


Considerable advantages would be gained if techniques could be
discovered which would predict the accident-proneness of individuals (if
such a phenomenon exists) in a given situation, before they had actually 
entered that situation and incurred accidents. These are clearly stated 
by Farmer and Chambers in their successive attempts to do so (32, 48, 
47 and 60). The authors are also careful to point out that: 
“Care must be taken not to make accident incidence per se a measure
of accident proneness, for this is to adopt the position of those who 
say that accidents are due to carelessness and when asked to define 
carelessness, do so in such a way as to leave little doubt that by care- 
lessness they mean having an undue number of accidents. ‘Accident 
proneness’ implies the possession of those qualities which have been 
found from independent research to lead to an undue number of acci- 
dents. If the term is used in this way a person can be said to be accident- 
prone without any knowledge of the number of accidents he has sus- 
tained, for this statement will merely mean that he is more likely than 
others in equal conditions of exposure to sustain accidents. Such a 
knowledge would make it possible to warn certain people against enter- 
ing dangerous occupations, so that although they were accident-prone 
in a relatively high degree, they might go through life with very few 
accidents.” 

Be this as it may, the fact remains that the discovery of these 
‘qualities’ can only be achieved by measuring the validity with which 
certain tests can predict some criterion of accident-proneness in a given 
group. This criterion can only be the number of accidents sustained 
by the respective individuals in that group. Moreover, whether or not 
‘accident-proneness’ really does exist, we should be very happy if we 
could predict the number of accidents a person is likely to have. Hence, 
a good fit with some theoretical bivariate distribution is of considerable 
importance quite apart from what the parameters happen to signify. 
And this is the obvious line of approach which the investigators take. 
A further point must be made in this connection. Prediction only be- 
comes feasible when the criterion of the attribute to be predicted is 
itself reliable and stable. If an individual’s accident record in the same 
set of circumstances is an unstable reflection of his ‘inherent accident- 
proneness’, then prediction of proneness by this means must be a shaky 
affair. The correlations between the incidence of accidents shown earlier
in this study indicate quite clearly that there is very little consistency
in the records of individuals from period to period. It is not surprising
therefore, that all attempts to predict accident-proneness using accident
records as a criterion or measure of this, have so far failed rather
lamentably. The investigators in 1933 appear to have been fully aware
of these difficulties for they state (48): “Unless individual
susceptibility to accidents is a stable quality manifesting itself during all periods
of exposure, we cannot expect a very definite relation between
psychological tests and accident rate (p. 7)”, and later in the same report (p. 11),
“Attention has been called to its smallness (i.e. correlation) since it
shows that accident susceptibility is not a very stable factor within
these groups; hence it cannot be expected that [...] in any test
purporting to measure susceptibility will have a very close relation with
recorded accidents.”

What is surprising, therefore, is that the investigators should have
persisted in the face of these insurmountable difficulties, before estab- 
lishing a more stable criterion of proneness. 

Having been unsuccessful in their first and second attempts the
investigators proceed to a third (48) and a fourth (47), and the reader
is amazed to find the latter report opening with what can only be
regarded as a statement of creed, in view of the absence of any
intervening research findings which could alter the position as stated by
them above. This is reflected here for comparison with the quotation
already given: “Previous statistical investigations have shown that
industrial workers exposed to equal risks were unequal in their liability
to sustain accidents, and that this unequal liability was a relatively
stable phenomenon, manifesting itself in different periods of exposure
and in different kinds of accidents”.

Furthermore, the stability of their own criterion in this particular 
case (as reflected by correlations between accidents in successive 
periods) was only of the order of .176 to .328. 

It is not surprising, therefore, that despite the one or two refine- 
ments introduced in the constitution of groups tested, these latest at- 
tempts should have yielded only inconclusive results. Surely if the 
efficiency with which the criterion can predict itself is only of the order 
of .3, then it is a fond hope to expect that any test, which can only
measure part of the criterion, will have a predicting efficiency as great
as, or greater than, .3.

Had the above research findings been more positive, they would 
have warranted fairly detailed consideration in this study. As it is, a 
summary of the main conclusions will suffice. The student who is 
interested in making further attempts on this problem is, however,
strongly advised to study the original works if only to acquaint himself
with the type of difficulty to be encountered in this work, and the pit- 
falls to avoid. These will be largely concerned with: (a) the selection 
of appropriate experimental groups, and the control of such influences 
as age, experience and environmental factors; (b) the definition of 
events which are to be regarded as accidents, and the completeness and
accuracy of these data; (c) the stability or consistency of accident
records in successive periods as a measure or criterion of personal acci-
dent-proneness; (d) the adequacy of existing instruments of statistical 
analysis, and the precautions which must be taken against the use of 
techniques which are suspect, to say the least, when applied to frequency 
distributions associated with accident data. The techniques usually 
applied in psychological research are, for the most part, dependent 
upon the normal distributions and correlation surfaces. Accidents on 
the other hand are a discrete variable and yield J-shaped or Negative 
Binomial Distributions. This fact alone alters the whole meaning and 
predicting efficiency of the ordinary product-moment correlation co- 
efficient. Above all it is essential to avoid the folly of applying statistical 
procedures, knowing at the same time that they cannot be justified on 
mathematical grounds. Self-deceit of this type has never produced any 
results worth having. 
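
The effect of the J-shaped distribution on the stability of the criterion can be illustrated by a small sketch. It assumes, purely for illustration, that each worker has a fixed personal accident rate drawn from a gamma distribution and that his accidents in each period follow a Poisson law with that rate, which yields Negative Binomial counts. The function names and parameter values are hypothetical and are not taken from the studies reviewed. Under these assumptions the product-moment correlation between two equal exposure periods is Var(rate) / (Var(rate) + Mean(rate)).

```python
import math
import random


def theoretical_period_correlation(shape: float, scale: float) -> float:
    """Correlation between accident counts in two equal exposure periods.

    Assumes rate ~ Gamma(shape, scale) and, given the rate, independent
    Poisson counts in each period (an illustrative model only).
    """
    mean_rate = shape * scale
    var_rate = shape * scale ** 2
    return var_rate / (var_rate + mean_rate)


def poisson_draw(rate: float, rng: random.Random) -> int:
    """Draw one Poisson count by Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def pearson(xs, ys) -> float:
    """Ordinary product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_period_correlation(shape: float, scale: float,
                                workers: int, seed: int = 0) -> float:
    """Simulate two periods for each worker and correlate the counts."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(workers):
        rate = rng.gammavariate(shape, scale)  # stable personal liability
        first.append(poisson_draw(rate, rng))
        second.append(poisson_draw(rate, rng))
    return pearson(first, second)


if __name__ == "__main__":
    # Hypothetical parameters chosen so that the theoretical value is .3
    print(theoretical_period_correlation(2.0, 3 / 7))
    print(simulate_period_correlation(2.0, 3 / 7, workers=20000, seed=1))
```

In this sketch the personal liability of every worker is perfectly stable by construction, yet the correlation between periods is only of the order of .3 when the mean rate is below one accident per period. A low coefficient computed from count data of this kind need not, therefore, carry the meaning it would have on a normal correlation surface.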

The main conclusions of the research of Farmer and Chambers in 
this direction can be summarised as follows: 

(a) Nowhere in these studies was the stability of the criterion (as 
indicated by the correlation of accidents in successive periods) higher 
than .44. It would be correct to state that a range of .17 to .32 was 
more representative of the data. 

(b) The investigators used a heterogeneous collection of tests “which
were classified rather as a matter of convenience than of anything
else”. They appear to have been largely individual tests involving
certain perceptual and psycho-motor skills, neuro-muscular co-ordina- 
tion (some of which were presumed to be related to temperamental 
instability) and finally some verbal tests of intelligence. 

The results showed that only in the case of the aesthetokinetic tests 
was there “a slight positive association between the functions involved
in the tests and those involved in accident-proneness”. These tests 
were the dotting test, reaction time test, Pursuitmeter test and co- 
ordination test. 

(c) Being dissatisfied with the usual statistical procedures for the
analysis of this type of data which yielded inconclusive results, the
investigators adopted other methods which led them to conclude:
“These alternative methods of examining the data show that significant
results are obtained only with those tests that yield significant product-
moment correlation coefficients”; and: “From the two methods of com-
parison by means of inter-quartile groups, it is clear that these tests
have prognostic value for accident-proneness only at the extreme ends
of their ranges, since there is a significant difference only between the
accident rate of those who are very good, and those who are very bad
at the tests”.

It is not clear, however, whether elaborate procedures of individual
testing are necessary to make this broad differentiation, or whether this
could not be just as efficiently achieved by some more obvious and
economical means.

(d) It is claimed by the investigators that “whatever may be re-
garded as the best way of measuring the association between the tests
and accidents, the final proof of the value of tests must depend on
whether their use lowers the accident rate”.

These are illustrated by the following table to which a final column
has been added by the present author:


TABLE SHOWING THE MEAN PERCENTAGE ACCIDENT RATE OVER THE WHOLE
PERIOD OF EXPOSURE OF GROUPS SELECTED BY DIFFERENT METHODS, TAKING
THE ACCIDENT RATE OF THE WHOLE OBSERVED GROUP AS 100 PER CENT.

                                                                           % of Total
                                     Ship-     Electric  Engine   Motor    Group
                                     wrights   Fitters   Fitters  Drivers  Eliminated

   Accident rate of whole group        100       100       100      100
1. Accident rate of those left
   after rejecting the high
   accident subjects in first year      87        99        94       91    11%-27%
2. Accident rate of top 3 inter-
   quartile groups of the
   aesthetokinetic tests                80        88        90       93    25%
3. Combination of methods 1 and 2       75        88        79       87    33%
4. Accident rate of top inter-
   quartile group only in the
   aesthetokinetic tests           [illegible]    61        66       74    75%


The investigators conclude advisedly that “such methods could be
employed, provided there was a good labour supply, for an occupation
in which it was particularly desirable to have workers with a low degree
of accident-proneness on account of its dangerous nature”.

They might also have considered the hardship which would be
suffered by the many individuals eliminated who would not necessarily
have many accidents because of the tenuous relationship.

(e) The final conclusion can be stated in the investigators’ own
words: “In the meantime it may be said that one of the factors involved 
in accident causation, though not necessarily the most important, has 
been isolated and roughly measured.” 

This factor must, however, be regarded as a descriptive term rather 
than a psycho-physiological factor. 

Viteles (64) quotes the findings of other investigators who have at-
tempted to predict accident-proneness among motormen by the use of
tests of perseveration, oscillation, speed, accuracy, muscle control, eye-
hand-foot-co-ordination, ocular balance, psychogalvanometer, pursuit-
meter etc. Of the work of Slocombe and Brakeman (68) he stated:
“Elaborate data on reliability and an extended ‘apologia’ on the sig- 
nificance of low coefficients of correlation between tests and criterion 
are presented by the authors, who conclude from the fact that 8 of 
the 43 high accident men made ‘poor’ scores on the tests in com-
parison with 2 of the 43 low accident men, that the tests are of value
in diagnosing and establishing the relationship between accident-prone- 
ness and test performances.” The most that can be claimed by the 
other investigators (Weiss and Lauer (69)), is to enumerate certain 
tendencies which “seem established and at least indicate the direction 
of future research”.

It will be readily appreciated from the above findings that the pre- 
diction of accident-proneness by these methods is an extremely hazard- 
ous affair. This is not surprising, in view of the fact that the necessary 
pre-requisite to an adequate predicting device (viz.—a stable and re- 
liable criterion) is lacking, and no amount of subsequent statistical 
juggling will compensate for this deficiency. 

Greenwood’s (51) comments are pertinent in this regard: “He be- 
lieved in proneness, and thought Mr. Farmer and his colleagues had 
done admirable work, but that he had not yet found practical tests 
which correlated so highly with what our ancestors called temperament 
that a satisfactory elimination of potentially dangerous drivers, which 
did not inflict hardship on individuals, was possible. But when we 
remember the enormous increase of efficiency in tests of the cognitive 
side of human nature since the beginning of the century, it was surely 
not Utopian to expect equally great improvements in our measuring of 
the conative side of man. It is likely, therefore, that the approach 
adopted in aptitude testing which lays the emphasis on the use of 
standardised tests measuring various abilities, skills and aptitudes etc.,
will have to be abandoned, or, at least heavily supplemented by clinical
procedures which will assess temperamental and personality character-
istics which appear to be more significant.”

It is appropriate therefore that we should consider the results of
some of these ventures, which, though inconclusive, are nevertheless
encouraging (58), (62), (63) and (73).

One example will be quoted in detail, as the techniques of the new
approach are worth studying. A clinical approach to the study of flying
accidents was adopted by Biesheuvel and White (62) during World
War II. Two groups of Air Force pilots were chosen for study. “(a)
An accident group consisting of 200 pilots under training, who had been
involved in flying accidents at elementary and advanced flying schools
during 1941 to 1944 inclusive, (b) A control group of 400 men who had
completed both elementary and advanced training with an accident free
record.

These groups were chosen in such a way as to ensure comparability
in respect of the following factors. (a) Test procedures used at the
Aptitude Test Section and Testers applying them. (b) Length of stay
at ground instruction school. (c) The Flying Training School to which
they were posted, and following on this, the instructors and types of
aircraft in which they received instruction. (d) Age and experience
factors were automatically equated as a result of acceptance for Air
Force Training.

On each pupil pilot the Aptitude Test Section obtained: (a) An
assessment of intelligence by means of pencil and paper tests. (b) Meas-
urement of flying skill from such tests as: Two-Hand Co-ordination,
Steadiness of Movement, Choice Reaction Time, Mechanical Aptitude.
(c) A personality assessment, including suitability for service pilot
duties. This assessment was based on observational, biographical in-
ventory and interview data. Among the most important constituents
were assessments of three constitutional temperamental variables:
Secondary Function—measuring tempo, variability, stimulability and
impulsiveness of behaviour. Activity—measuring drive and persistence
in the face of obstacles. Emotionality—measuring the degree to which
feeling is a determining factor in behaviour”.


Great care was taken by the investigators to define the type of 
accident data to be included in the study. The following conditions 
were made: ‘‘(a) All accidents occurred when the pupil was flying solo, 
as it is sometimes difficult to determine who is responsible when flying 
dual. (b) All accidents due to mechanical defects in the engine or air- 
frame, in which courts of enquiry subsequently revealed that the pupil 
could not be held responsible, were excluded. (c) All accidents in which
the pupil played a completely passive role (i.e. when occupying a sta-
tionary aircraft hit by another coming in to land) were also excluded.” 

As the result of statistical analysis, it was found that in terms of 
the extent to which individual scores or assessments deviated signifi- 
cantly from the mean of an unselected pupil pilot population, it was 
possible to assign Accident and Safety “indicators” to each pilot in 
respect of those tests which were found to have discriminating value. 
These were: (i) Two-hand co-ordination—speed. (ii) Two-hand co- 
ordination—quality. (iii) Mechanical aptitude. (iv) Secondary func- 
tion assessment. (v) Emotionality. (vi) Sporting background. (vii) 
Parental attitude to subject. (viii) Death of parent. 

When Safety and Accident indicators were assigned to each pilot 
in both groups in respect of the above tests, the following results were 
obtained: 


                         Accident           Safety           Total No.
                         Indicators         Indicators       of
                         No.      %         No.      %       Indicators

Accident Group
  (200 cases)            234     61.4       147     38.6       381
Control Group
  (400 cases)            414     44.6       515     55.4       929
Difference                       16.8               16.8


The following conclusions were drawn: “The difference in percentage
between the accident and control groups is 5.64 times its standard error, 
which means that the probability is more than 99.99 in 100 that this 
difference is not due to chance. 

In practice, however, we have to deal with individuals, and not with 
groups, and the question still remains how to interpret a pilot’s score 
as indicative either of accident-proneness or of safety. 

We found that if we scored each accident indicator -1 and each
safety indicator +1, the pilots in the two groups distributed themselves
on a scale +4 at one end (4 more safety indicators than accident indi-
cators), and -4 at the other end (4 more accident indicators than safety
indicators). By trial and error we discovered that a score of +1 or of
-1 could be treated as critical, for 64.7% of the pilots of the accident
group had scores of -1 or less, and 67.3% of the pilots in the control
group had scores of +1 or more. If we had applied this criterion before
the pupils concerned in this investigation had started flying, we would
have been correct in 66.5% of our predictions as to the likelihood that
they would cause an accident. This result is remarkable for the fact 
that the accident cases in this investigation consisted of pilots who for 
the most part (77%) had only one accident during their training course. 
The majority could not be described as accident-prone. 

In order to check the validity of this method a group of 17 qualified 
pilots at a Fighter Operational Training Unit, who had accidents for 
which they were held responsible, was examined. The results are given 
below. 


                                                     No. of Cases
Nett total of more than 1 accident indicator               3
Nett total of 1 accident indicator                        10
Neither accident nor safety indicators                     2
Nett total of 1 safety indicator                           1
Nett total of more than 1 safety indicator                 1
Total                                                     17


Thirteen out of the 17 cases or 76.5% would, therefore, have been 
predicted. Two cases or 11.8% would have been wrongly predicted.”
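
The ratio of 5.64 and the validation percentages quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the investigators used the ordinary unpooled standard error of a difference between two independent proportions, which the report as quoted does not state; the function name is hypothetical.

```python
import math


def difference_ratio(hits_a: int, total_a: int, hits_b: int, total_b: int) -> float:
    """Difference between two proportions divided by its unpooled standard error."""
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    standard_error = math.sqrt(
        p_a * (1 - p_a) / total_a + p_b * (1 - p_b) / total_b
    )
    return (p_a - p_b) / standard_error


# Accident indicators: 234 of 381 in the accident group,
# 414 of 929 in the control group (figures from the table above).
ratio = difference_ratio(234, 381, 414, 929)

# Validation group of 17 pilots: 3 + 10 showed a nett total of accident
# indicators, 1 + 1 showed a nett total of safety indicators.
predicted_pct = 100 * (3 + 10) / 17
wrong_pct = 100 * (1 + 1) / 17

print(f"ratio of difference to its standard error: {ratio:.2f}")
print(f"correctly predicted: {predicted_pct:.1f}%")
print(f"wrongly predicted: {wrong_pct:.1f}%")
```

Under this assumption the ratio comes out a little above 5.6, in close agreement with the quoted figure; a pooled standard error would give a somewhat smaller value. The percentages for the 17 qualified pilots agree with those quoted.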

Of equal interest, however, are the recommendations of the investi- 
gators as to how the above findings should be applied in practice as 
part of an accident prevention programme. It is not suggested that a 
candidate for flying training, scoring -1 or -2, should be summarily
rejected. Candidates with these scores should merely be regarded as 
potential accident casualties whose performance in the training schools 
should be closely watched so that measures could be taken immediately 
if any weaknesses became apparent. Furthermore, the final decision as 
to whether to ground a pupil should be taken on a clinical and not purely 
statistical basis, in the light of all the available facts. It was only 
considered prudent to reject candidates with scores of -3 and -4.
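
The scoring rule and the recommendations just described can be set out as a short sketch. The function names are hypothetical, and the treatment of scores of 0 and above as a plain acceptance is an assumption, since the recommendations quoted deal only with the negative scores.

```python
def nett_score(accident_indicators: int, safety_indicators: int) -> int:
    """Each safety indicator counts +1 and each accident indicator -1."""
    return safety_indicators - accident_indicators


def recommendation(score: int) -> str:
    """Map a nett score to the action described in the text."""
    if score <= -3:
        # Scores of -3 and -4: rejection was considered prudent.
        return "reject"
    if score <= -1:
        # Scores of -1 and -2: not rejected, but watched closely in training,
        # with any final decision taken on clinical grounds.
        return "accept, watch closely"
    # Assumption: no special action for scores of 0 and above.
    return "accept"


if __name__ == "__main__":
    score = nett_score(accident_indicators=3, safety_indicators=1)
    print(score, recommendation(score))
```

The sketch encodes only the statistical screening step; the investigators stress that the decision to ground a pupil rests on all the available facts and not on the score alone.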

Although the studies referred to above are particularly useful in 
helping us to re-orientate our approach to the study of the personal 
factor in accident causation, there are nevertheless certain limitations 
which the investigator should bear in mind. Analyses of this type 
always involve prediction on ‘a posteriori’ grounds—one estimates the 
efficiency of procedures in predicting those individuals who are already 
known to have sustained accidents. While this may well lead to the
standardization of useful techniques, their final validation must always
depend on the acid-test of either a follow-up study on the two groups,
or the reapplication of the techniques to a completely new and unselected
group where predictions are made before all the individuals enter the 
occupational field in question and these are correlated with the subse- 
quent accident records. 

Over and above these difficulties hangs a “sword of Damocles” in
the form of the inconsistency of the criterion to be predicted and which 
has already been stressed in earlier sections of this study. Owing to 
this factor it is conceivable that the validity of indicators already es- 
tablished may be seriously upset. It is feared that the investigator will
always come up against this statistical stumbling block. If the incidence
of accidents in successive exposure periods is as unstable as previous
research findings have shown it to be, then surely, one can hope for 
little efficiency in one’s predicting techniques. There is of course, the 
final hope that if more attention is given to the definition of accidents 
to be included in the study that this deficiency may be overcome. 
The unreliability of the criterion may in some measure be due to the 
fact that accident data are not uniform, and that by merely including 
for study all accidents (from existing records) irrespective of causes or 
the manner in which they occurred, one is in effect collecting a ‘hotch- 
potch’ of events which are by no means homogeneous, or representative 
of the phenomenon of proneness. The result may be similar to that of
mixing measures of height and weight, or scores from test A and test B 
in the same distribution. 

This point might well be elaborated for the consideration of future
research workers. It is felt that in most previous research scant at- 
tention has been paid to the definition of what constitutes an accident. 
It is appropriate that the following questions should be asked and 
answered: (1) What are the phenomena that are being studied? What 
is our variable? What constitutes our total statistical population of 
events? (2) If it is not possible to record the total population of events 
in any given set of circumstances, what sample can we hope to study? 
(3) How is this sample selected, and is it representative of the whole? 

An accident might justifiably be defined in the following manner: 
“In a chain of events, each of which is planned or controlled, there 
occurs an unplanned event, which, being the result of some non-ad- 
justive act on the part of the individual (variously caused), may or 
may not result in injury. This is an accident”.

There are certain aspects of this definition which need emphasizing. 
Firstly it is the occurrence of the unplanned or unpredicted event which 
constitutes the accident. Secondly, this event is due to some non-
adjustive act on the part of the individual concerned. Thirdly, the re-
sulting injury is a consequence of this unplanned event, and does not 
itself constitute the accident—it follows afterwards. From this it fol- 
lows that our statistical population of events is the sum total of all 
unplanned incidents in the given environmental circumstances and in 
the given population of individuals. By definition also, these will 
exclude all events which can definitely be attributed to the influence of 
mechanical or impersonal causes e.g. the case of an omnibus driver being 
stung by a hornet, inherent defects in the machine etc. This naturally 
implies a strict investigation of all accidents in order to establish the 
factor of personal causation. This will also serve the purpose of debit- 
ing the correct individual for the responsibility of the accident. Thus 
if A drops a brick and it falls on B’s head, the former has the “un-
planned event”, and the latter the injury; who has the accident? The
answer surely is A and not B who is usually debited with this in the
accident register.

By definition all the following must be regarded as constituting our
population of accidents: (1) All errors or slips or “near accidents”
which result in no injury; (2) all unplanned events resulting in minor
injury; (3) all unplanned events resulting in major or lost time injuries
which incapacitate; (4) all unplanned events resulting in death.

We might consider to what extent these data can be made available.


(i) Errors, Slips and Near Accidents. Here we can be almost sure that 
data in respect of these would be incomplete, even with the most 
rigorous supervision. It might be profitable, however, to set up ex-
perimentally controlled conditions in a laboratory, equating the en-
vironmental circumstances for all subjects and undertake the study of 
errors in some series of test situations. This might well throw some 
light on the proneness of individuals to this type of ‘accident’, which 
could perhaps be related to accidents in the industrial situation. 


(ii) Accidents Producing Minor Injuries. Only under highly controlled 
conditions could one be sure of amassing data which were complete and 
accurate, and which did not merely reflect a tendency on the part of 
some to report these incidents. 


(iii) Accidents Producing Major Injuries. Here one could be reasonably 
sure of collecting all the data. The difficulty is, of course, that in- 
capacity usually means a subsequent period of non-exposure which may 
be of considerable length and may even result in the removal of the 
individual from the work situation. 


(iv) Accidents Resulting in Death. Here one is virtually certain of the
data, but these occurrences are inevitably followed by subsequent non-
exposure. 

This self-elimination of many of our cases (and unfortunately they 
are the very ones whom one is interested in studying) has a serious 
effect upon subsequent statistical analysis. 

Thus under normal circumstances (i.e. excluding those subject to 
the highest degree of control and supervision) one can only be reasonably 
sure of collecting and including complete data on major accidents. This 
means that we get a selected sample of our total population of “un-
planned events” (or accidents) for study, and this sample is selected
purely in terms of subsequent injury. Now if the resulting injury is
determined by chance, this would mean that we should get a random or
representative sample of our total population of events. This would be
extremely fortunate, for any conclusions arrived at on the basis of this 
sample would then be applicable to the total population of events. If 
on the other hand the resulting injury were not chance-determined, but 
were in some way a function of the manner in which the unplanned 
event originally occurred then we should be left with a selected sample,
and statistical results could no longer be applied to the parent population 
from which these events came. It might be justifiable to argue that in 
any case one is only interested in studying major accidents and not the 
minor which have trivial consequences. This may well be so but it 
must then be conceded that one is no longer studying accidents but 
merely the incidence of injury resulting from accidents, which may have 
nothing to do with personal liability. 

Under the circumstances where one is studying accidents causing 
major injury only, it would appear that there are two “liabilities” to 
be considered: (a) in the first place the liability of the individual to 
have an unplanned event or accident in a given environment, and (b) the 
liability of this event to result in subsequent injury in a given environ- 
ment—resulting in its being recorded. 

It is conceivable, therefore, that if injury were not chance deter- 
mined, the following results may be obtained where— 

e represents unplanned events without injury,
(e) represents unplanned events with major injury,
and the suffix indicates the type of accident (suffixes illegible here).

In the case of two individuals the results might well be:

Individual A: e, e, e, e, e, (e), e, e = 1 reported major accident

Individual B: e, (e), (e), (e) = 3 reported major accidents.

In the records B would obviously be considered more accident-prone
than A, but this would not be a reflection of his proneness to unplanned 
events (slips or near accidents), since B has only 4, whereas A has 8. 
It would rather be a reflection of the liability of such events as he 
had to result in injury. This may well be determined by factors beyond 
his control and the study of the (e)’s would give little indication of the
individual’s susceptibility to e’s.
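
The contrast between the two individuals can be set out as a short sketch. The position of the injury-producing events within each sequence is arbitrary, and the function name is hypothetical.

```python
# True marks an unplanned event that resulted in major injury, i.e. (e);
# False marks an unplanned event without injury, i.e. e.
individual_a = [False, False, False, False, False, True, False, False]
individual_b = [False, True, True, True]


def summarise(events):
    """Return (all unplanned events, recorded major accidents, injury fraction)."""
    total = len(events)
    recorded = sum(events)
    return total, recorded, recorded / total


if __name__ == "__main__":
    for name, events in (("A", individual_a), ("B", individual_b)):
        total, recorded, fraction = summarise(events)
        print(f"Individual {name}: {total} unplanned events, "
              f"{recorded} recorded, injury fraction {fraction:.3f}")
```

The accident register sees only the second figure for each individual, so B appears three times as liable as A although A has twice as many unplanned events.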

It is appreciated that the development of this theme of the ‘double
liabilities’ tends to lead to a state of confusion where one feels that 
the problem slips one’s mental grasp. This is not because the argument 
itself is confusing, but rather because we have not yet devised tech- 
niques of analysis which will enable us to disentangle the skeins of a 
confused and intricate pattern of events. Furthermore, our attempts 
to oversimplify the accident-causing situation by seeking to subdivide 
it into “personal causes” and “environmental causes” tends to lead us
nowhere. Greenwood and Woods in their original report recognized that
“individual susceptibility sheltered a motley host of motives and factors
which will be very difficult indeed to separate and measure”. Surely 
the essence of accident causation is the rather intricate inter-relation- 
ship which exists between the individual and the environment and the 
influence of one cannot be appreciated without considering its inter- 
action with the other, and to attempt to separate the two is about as 
profitable as attempting to unravel the respective influences in the 
heredity vs. environment controversy. Full comprehension of these 
inter-relationships and their statement in universal ‘laws’ will possibly 
only be achieved when we have evolved statistical techniques which 
will facilitate three- and perhaps four-dimensional thinking in these 
matters. Existing methods of analysis tend to make the problem too 
simple—too dependent on the direct cause-and-effect relationship. 

It is conceivable that with our present instruments of analysis we 
may yet succeed in pinning down the personal factor in accident-causa- 
tion (or accident-proneness) in terms of some strict definition—but this 
may involve so many restrictions in the mathematical sense that the 
concept will bear little relation to the real factors in the everyday 
industrial situation. The above comments do not constitute a denial of 
the usefulness of statistical procedures in investigations of this type, nor 
is it suggested that the research worker can ever dispense with them 
altogether. The conclusions indicate rather the limitations of these 
techniques, and emphasise the fact that statistics can never do your 
thinking for you. 

The above considerations may well lead one into a state of despond- 
ency and frustrated resignation were it not for the fact that a different
approach to the problem seems possible. This approach may not
enable us to analyse accident liability in terms of universal principles, 
laws of distribution etc., with reference to human behaviour in general, 
but it may well enable us to reduce the incidence of accidents in our 
community by enhancing our knowledge of the mechanisms at work in 
causing individuals to have accidents. And surely this is the ultimate 
objective of all this work. This approach has been clearly explained 
by Viteles (64) in his reference to the accident reduction programmes 
of the Cleveland Railways Company (65). Viteles’ introductory com- 
ments to these studies are worth considering. 

“The psychological study of accidents in the manufacturing industry 
has been largely confined to a statistical study of factors influencing 
accident susceptibility. Such statistical studies are of questionable 
significance in arriving at a knowledge of the causes of accidents... . 
However, they suffer from serious limitations as practical aids in the 
reduction of accidents.” 

“In the first place the statistical approach is oriented from the 
viewpoint of discovering relations existing in a group of individuals, 
and not from the point of view of the adjustment of the single individual 
who has become involved in, or is susceptible to accidents. ... The 
function of the statistical approach and of statistical investigations in 
preventing accidents attributable to the human factor may be de- 
scribed as that of investigatory group tendencies. In contrast with 
this is the clinical approach—the functions of which are to determine 
the relationship existing among a number of factors which have played 
or may play a part in the case of the individual who becomes involved 
in accidents and to develop a programme for the prevention of addi- 
tional accidents on his part.”

“Another limitation of the statistical viewpoint in accident preven- 
tion is its emphasis upon isolated aspects of individual personality, in 
contrast with the concern, in the clinical approach, for the total per- 
sonality of the accident-prone individual. ... It is undoubtedly true 
that a detailed examination of each stone tells much about the structure 
of a mosaic, but the contribution of each to the value of the whole flows 
from the integration of the various parts, and can only be fully deter- 
mined through an examination of the whole, and of the inter-relation- 
ships among the parts in the whole”.

“The aim of the clinical approach is to examine the whole individual, 
and from an examination of the whole to arrive at a knowledge of the 
significance of the various aspects of his personality—the relative im- 
portance of each sector of his personality in a given situation. The 
application of the clinical approach in the analysis of accident causes
involves a complete study of the individual involved in accidents—it
makes the individual the point of departure, and provides for a thorough
examination of every factor—physical, mental, social, and economic,
and of those extraneous to the individual—which may have played a
part in the accident in which he has been involved.”

“The diagnosis of accident-proneness is being followed by specialised 
treatment, and treatment based on an exact knowledge of the factors 
which are responsible for the accident record—in the case of the partic- 
ular individual. Treatment takes the form, not of mass education, or 
the more drastic measure of termination but most frequently that of 
systematic instruction designed to efface faulty habits . . . medical 
treatment, discipline, encouragement and supervisory follow-up... . 
It recognises that there are many different causes of accidents and that 
they may combine in different patterns in different individuals.” 

“The knowledge of the factors which play a part in the case of a 
single individual is obtained by an experimental study of the individual. 
This includes psychological examination, close observation of operation 
details, a review of his relationship with supervising officers and fellow- 
workers, and possibly a detailed study of the home circumstances.” 

“The case study of each accident-prone motorman in the Cleveland 
Railway Company involved: (1) A careful examination under normal 
operating conditions of: (a) General Operation, (b) Motoring habits, 
(c) Mental factors, (d) Physical factors. (2) Analysis of previous and 
current years accident record. (3) Personal Interview. (4) Decision 
as to primary causes of accident-proneness. (5) Preparation of report 
of case recommending treatment based upon findings. (6) Treatment 
and follow-up.” 

“The most significant finding of the Cleveland study is that in no 
two cases were the causes of accident-proneness exactly similar. In 
most instances, several causes existed, although in each case, one of 
these was found to be of primary importance. The percentage distribu- 
tion of primary causes of accident-proneness among the 50 men is given 
in the following table. 

As a result of this application of the clinical method in the study 
and treatment of motor drivers, the combined rate of accidents on the 
part of the motormen involved in the study dropped from 1.31 per 
1,000 miles in 1928 to .75 in 1929, equivalent to a reduction of 42.7%”. 

The accident records of accident-prone motormen before and follow- 
ing study are shown in Figure 1. 

Primary causes of accident-proneness among the 50 men (percentage distribution)

Faulty judgment of speed and distance . . . 12
Irresponsibility
Failure to keep attention constant
Nervousness and fear
Defective vision
Organic disease
Slow reaction
High blood pressure
Senility
Worry and depression
Fatigability
Improper distribution of attention
Inexperience
Miscellaneous

Viteles then quotes the experiences of the accident clinics run on
similar lines at the Milwaukee Railways and Light Coy. (under the
direction of Dr. Bingham (67)), and illustrates the value of these
techniques by stating that:

“As a result ... the actual nett savings in cost of injuries and damages
in 1929 as compared with 1928 amounted to $300,670. Collision acci- 
dents on bus lines and street railways have been reduced more than 
35% since the work was started in 1927.... The figures for 1931, 
as compared with those for 1926, show a 58% reduction in collision of 
surface cars with trolleys; 54.4% with pedestrians; 71.3% with other 
surface cars; 36.8% in boarding and alighting accidents; and 23.6% in 
all other accidents. ... Associated with these economic returns are 
enormous social benefits in reducing the suffering and the general social 
maladjustment associated with personal injury resulting from accidents.”

FIGURE 1 (reprinted from Viteles (64), page 364)

Spectacular though these results are, their achievement should not
pass without certain comment. (1) The application of these procedures
can be an expensive business where one spends a considerable time on
individual cases. In the case of railway accidents where the consequences
are very severe on account of the large insurance and damages
claims etc., the proposition becomes an economic one. It is doubtful,
however, whether many industries could be persuaded to adopt similar
measures where, owing to the lesser consequences of accidents, the
economic gains would not necessarily show a profit in the financial
statement, even though the social and indirect gains might be equally
as great. (2) These results do not necessarily constitute a validation
of the clinical procedures used nor of the diagnoses made in individual
cases. This can only be achieved by their application to (a) an
experimental, and (b) a control group, as was done in the Hawthorne
Experiment of the Western Electric Company. The point is, of course, 
that the results may be just as “screwy” as they were found to be in
this classic experiment, viz. the accident rates in the control group
(receiving no treatment) may also go down, not because of any measures 
being adopted in their own case, but principally because they were 
being applied in the case of another group. This might indicate that 
individuals were responding, not so much to the “shrewdness” of diag- 
nosis, as to the new interest being taken in them by management—to 
the new psychological atmosphere prevailing in the works. It is
conceivable that the diagnoses may not have been directly responsible for
the results. These may have been due primarily to management’s 
interest in the situation. Employees were taken into management’s 
confidence. Their problems were discussed with them in a sympathetic 
manner which enabled them to understand them better; this probably 
created a new attitude of mind on the part of the workers to the problem 
of accidents, and encouraged them to think of safety in new terms, and 
to regard their work habits in a new light. They were thus able to see 
the significance of causes and relationships which had probably never 
occurred to them before, and to heed these in the everyday work situa- 
tion. It is possible that this new mental approach pervaded the whole 
work group as the result of clinical studies, and may well have been 
the more general and underlying factor responsible for the spectacular 
results achieved. The natural rejoinder to these comments is: “Does it
matter, as long as the results are forthcoming, and the whole business 
is a paying proposition?” This point is readily conceded, but the truth 
of the matter is of fundamental interest to the investigator. It may 
well be that the accident clinicians on these railways were not really 
gaining so much knowledge concerning the direct relationship between 
personal factors and accidents, as learning what Roethlisberger has 
called “New Social Skills”. If this is so, then the development of these
is of the utmost importance. It may be possible therefore, in the future 
to refine these techniques, along new lines, and make them more of an 
economic proposition to the average industrial plant. (3) The third 
point to be emphasised here is the fact that the term “accident-proneness”
is now no longer used in the original sense. It is now defined on
the basis of a clinical and not a statistical diagnosis, and, as such, is 
largely a term of convenience, rather than of precise mathematical 
definition. This loss of precision is perhaps more than compensated for 
by the advantages of a new approach “which places special emphasis
on the individual . . . and recognises the great variety of individual 
differences. In dealing with each motor-man, or each truck driver, or 
each automobilist, he is recognised not as one of the mass, but as a 
distinct personality, unique”. (Bingham (67); quoted by Viteles (64) 
p. 385). 




ACCIDENT STATISTICS 


PART II: THE MATHEMATICAL BACKGROUND 
INTRODUCTION 


In Part I the problem of accident causation was discussed, and a
critical examination made of our existing knowledge of the concept of 
accident proneness. To interest as wide a circle of readers as possible, 
the use of mathematical formulae and terminology was avoided. 

However, the fact remains that such knowledge as we possess about 
accidents has its basis in statistical theory, hence it is appropriate that 
this should be clearly stated for the benefit of those who might wish 
to study the problem of accident causation still further. 

The writers have found that previous mathematical theory on this 
subject is scattered far and wide in independent publications extending 
over the period 1919-1950. It has been a laborious task to plough 
through the literature, put the pieces of the puzzle together and con- 
struct a meaningful picture in proper perspective. We hope that what 
we have done will save others from having to follow the same devious 
route. 

It seems to us that the earlier writers had a very clear conception 
of the problem, but that their mathematical treatment of it was often 
clumsy and difficult to follow. Perhaps this explains why, as the 
literature on the subject grew, this clarity of thought has been lost and 
we have recently suffered from a spate of articles of a semi-technical 
nature which in our opinion are positively misleading. 

The objects of Part II are: 1. To give a uniform and simplified 
treatment of the previous theory on the subject. The physical sig- 
nificance of the theorems obtained will be carefully discussed, and so 
will the various pitfalls that are encountered. 2. To criticise those parts
of the theory that appear to be sterile and expand certain ideas that 
suggest a more profitable line of attack. 

To profit from all this the reader must have a sound knowledge of 
the elements of mathematical statistics as given, for example, in 
Weatherburn (80). 

Certain basic theorems on contingent probabilities are used again 
and again, and a certain simplicity and uniformity of treatment obtained 
by lavish use of probability generating functions and factorial moment 
generating functions. The value of these elegant tools should be more
widely recognised. For the convenience of the reader all these formulae
are given in an appendix. 

Accident statistics refer to phenomena which occur from time to 
time. Clearly one should look for mathematical models of what happens 
in the theory of stochastic processes which deals with conditions that 
alter from moment to moment. 

Stochastic processes form by now an important and rapidly expanding
branch of mathematical statistics. (See for instance the admirable
article by Feller (84) and the books by Lundberg (78) and
Arley (81).)

In what follows only the elements of this topic will be needed, and 
these elements will be discussed de novo. On the other hand, the ideas 
contained in them will permeate the theory as developed here in a 
manner which is lacking in most of the earlier literature. (This powerful 
idea of a moment by moment analysis is found in the paper by Green- 
wood and Yule written in 1920 (44) but unfortunately their algebra 
was somewhat intractable and the idea did not catch on). 


SECTION I: THE PURE CHANCE DISTRIBUTION OF ACCIDENTS 


Choosing the same starting point as Greenwood and Woods (43) did 
in 1919, consider a population of people working in a given environment 
and study what would happen to them within a given period of time 
if accidents occurred by pure chance.* 

Suppose that the population is kept under observation from time
$t_0$ to time $t_1$ and that $x_0$ is the number of accidents that an individual
meets with during this time interval $\delta t_0 = t_1 - t_0$.

Then $x_0$ is a statistical variable with a law of distribution

$$p_0(x_0), \text{ say} \qquad (1.1)$$

That is, the population will split up into groups consisting of

a proportion $p_0(0)$ who have 0 accidents
a proportion $p_0(1)$ who have 1 accident
and so on (1.2)

Now suppose that the population is kept under observation for a
second interval of time $\delta t_1 = t_2 - t_1$ and let $x_1$ be the number of
accidents that an individual meets with during the interval $\delta t_1$. Then $x_1$
is a statistical variable with a law of distribution

$$p_1(x_1), \text{ say} \qquad (1.3)$$

*“Population” is used in the statistical sense of an indefinitely large aggregate of individuals from
which we can only observe random samples, and the properties of “pure chance” are defined a little
further on.


(In general, the mathematical form of $p_1$ will be different from that of $p_0$.)

During the second interval, then, the population as a whole will
follow the law of distribution $p_1(x_1)$. But if accidents are due to “pure
chance” it would seem that each of the groups accidentally formed
during the first interval should follow the same law of distribution during
the second interval. By chance all members of one group had, say, two
accidents each during $\delta t_0$ and all members of another group had, say,
four accidents each; but this was “by chance” and does not reflect any
difference between the physical and mental qualities of members of the
one group and members of the other group; nor does it reflect any
difference between the environment of the one group and the environment
of the other. So during the next time interval $\delta t_1$ we expect both
groups to show the same distribution of accidents.

In more detail, we mean that we expect that proportion of the
population, $p_0(2)$, who had two accidents each during $\delta t_0$ to split up
into subproportions

$p_0(2)\,p_1(0)$ who have 0 accidents during $\delta t_1$
$p_0(2)\,p_1(1)$ who have 1 accident during $\delta t_1$
and so on (1.4)

and similarly we expect that proportion of the population, $p_0(4)$, who
had four accidents each during $\delta t_0$ to split up into subproportions

$p_0(4)\,p_1(0)$ who have 0 accidents during $\delta t_1$
$p_0(4)\,p_1(1)$ who have 1 accident during $\delta t_1$
and so on (1.5)

In general, then, we expect that

$$p(x_0, x_1) = p_0(x_0)\,p_1(x_1) \qquad (1.6)$$

where $p(x_0, x_1)$ is the probability of an individual having $x_0$ accidents
in the first time interval and $x_1$ accidents in the second.*

*That is, while $p(x_0, x_1) = p_0(x_0)\,p_1(x_1 \mid x_0)$ under all circumstances, the assumption is made here
that $p_1(x_1 \mid x_0) = p_1(x_1)$ and forms a mathematical definition of the properties of “pure chance”.

In the above, $\delta t_0$ and $\delta t_1$ were two adjacent time intervals. Clearly,
as long as pure chance conditions hold, the argument can be repeated
for any two non-overlapping time intervals.

Note further, that within any very small time interval we expect
that a small proportion of the population will meet with one accident
each and a very much smaller proportion will meet with two or more
accidents each.

Finally, it is conceivable that the population may behave as described
above from time $t = 0$ to time $t = T$, but that this will not be
true after $t = T$ or before $t = 0$.

Much of the above is phrased rather loosely, but it suggests the 
following mathematical model, in which vague phrases are replaced by
symbols with sharply defined properties: 

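Divide the time interval $t = 0$ to $t = T$ into $k$ subintervals $t_0$ to $t_1$,
$t_1$ to $t_2$ etc.,

where $t_0 = 0$, $t_k = T$ and

$$\delta t_i = t_{i+1} - t_i, \quad i = 0 \text{ to } k - 1 \qquad (1.7)$$

Let $x_i$ = number of accidents that an individual has during the
interval $\delta t_i$, and assume that $x_i$ has a distribution law $p_i(x_i)$ (1.8)

where

$$\begin{aligned}
p_i(0) &= 1 - f(t_i)\,\delta t_i - o(\delta t_i) \\
p_i(1) &= f(t_i)\,\delta t_i + o(\delta t_i) \\
p_i(x_i) &= o(\delta t_i), \quad x_i > 1
\end{aligned} \qquad (1.9)$$

In addition, assume that

$$p(x_0, x_1, \ldots, x_{k-1}) = p_0(x_0)\,p_1(x_1) \cdots p_{k-1}(x_{k-1}) \quad \text{for } k > 0. \qquad (1.10)$$

When (1.9) and (1.10) hold, the population will be said to be “homogeneous
within the time interval $0 < t < T$”.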
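Now let $x$ = number of accidents per individual during interval 0 to $T$

$$x = \sum_{i=0}^{k-1} x_i \qquad (1.11)$$

Then (see Appendix)

$$M[x_i, u] = 1 + u f(t_i)\,\delta t_i + o(\delta t_i) \qquad (1.12)$$

$$\log M[x_i, u] = u f(t_i)\,\delta t_i - o(\delta t_i) \qquad (1.13)$$

whence

$$\log M[x, u] = \sum_{i=0}^{k-1} \bigl(u f(t_i)\,\delta t_i - o(\delta t_i)\bigr) \qquad (1.14)$$

Let $k \to \infty$ while the largest $\delta t_i \to 0$.
Then

$$\log M[x, u] = u \int_0^T f(t)\,dt = u\lambda, \text{ say} \qquad (1.15)$$

so

$$M[x, u] = e^{\lambda u}$$

and

$$G(x, u) = e^{\lambda(u - 1)} \qquad (1.16)$$

whence

$$p(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots \qquad (1.17)$$

which is often termed the Poisson distribution with parameter $\lambda$.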
Immediate corollaries are:

(i) over the subinterval $\delta t_i$, $x_i$ has a Poisson distribution with parameter

$$\lambda_i = \int_{t_i}^{t_{i+1}} f(t)\,dt \qquad (1.18)$$

(ii) over two non-overlapping subintervals $\delta t_i$, $\delta t_j$

$$p(x_i, x_j) = p_i(x_i)\,p_j(x_j) \qquad (1.19)$$

where $x_i$ and $x_j$ have Poisson distributions with parameters $\lambda_i$ and $\lambda_j$
respectively.

(iii)

$$\rho_{ij} = 0 \qquad (1.20)$$

where $\rho_{ij}$ is the coefficient of correlation for the bivariate
population whose joint distribution is $p(x_i, x_j)$.


Equations (1.9) and (1.10) give a clear cut mathematical definition 
of a homogeneous population and (1.17) shows that a homogeneous 
population has a Poisson distribution. 
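
The limiting argument that leads from (1.9) and (1.10) to (1.17) can be checked numerically. The following is a minimal sketch only, and rests on assumptions that are not in the text: the function `hazard` is a hypothetical choice of $f(t)$, the period is taken as $T = 2$, and the $o(\delta t)$ terms of (1.9) are simply dropped. Each subinterval then yields one accident with probability $f(t_i)\,\delta t_i$, the subintervals are combined independently as in (1.10), and the result is compared with the Poisson law whose parameter is $\lambda = \int_0^T f(t)\,dt$.

```python
import math

def hazard(t):
    # Hypothetical f(t), chosen for illustration; any non-negative function would do.
    return 0.5 + 0.3 * t

def exact_distribution(k, T=2.0, max_x=10):
    """Distribution of x = x_0 + ... + x_{k-1} when each of k equal subintervals
    independently yields one accident with probability f(t_i) * dt, else none."""
    dt = T / k
    dist = [1.0] + [0.0] * max_x            # before any time has passed: 0 accidents
    for i in range(k):
        p1 = hazard(i * dt) * dt            # p_i(1) with the o(dt) term dropped
        new = [0.0] * (max_x + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p1)     # no accident in this subinterval
            if x + 1 <= max_x:
                new[x + 1] += prob * p1     # one accident in this subinterval
        dist = new
    return dist

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

T = 2.0
lam = 0.5 * T + 0.3 * T ** 2 / 2            # integral of the hypothetical f(t) over 0 to T
coarse = exact_distribution(k=10)
fine = exact_distribution(k=20000)
max_gap_coarse = max(abs(coarse[x] - poisson_pmf(x, lam)) for x in range(8))
max_gap_fine = max(abs(fine[x] - poisson_pmf(x, lam)) for x in range(8))
print(lam, max_gap_coarse, max_gap_fine)
```

As the subdivision is refined the largest discrepancy from the Poisson probabilities shrinks, although $f(t)$ is not constant; only the integral of $f(t)$ enters the limit.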

Has this model got the properties envisaged by the “chance dis- 
tribution of accidents” discussed rather wordily earlier on? Examine 
equations (1.9) and (1.10) again. 

(1.10) defines mathematically the fundamental property of “pure 
chance”: that on the whole what happens in one time period has no 
effect upon what happens in any other non-overlapping time period. It 
is an immediate extension of (1.6), the physical significance of which
has already been discussed. 

(1.9) has two important characteristics: (a) it pictures an observed
property of accidents: in a short enough interval of time $\delta t$ it is “very
rare” for an individual to have one accident, because $p(1)$ is proportional
to $\delta t$, and “practically impossible” for him to have two or more
accidents, because the associated probabilities are proportional to higher
powers of $\delta t$. (b) The probability of having an accident can vary from
one time interval to another but this variation, which is measured by
$f(t)$, depends on time alone and not on what has happened in any previous
interval. E.g.

$p_i(1) = f(t_i)\,\delta t_i + o(\delta t_i)$ is a function of $t_i$ and $\delta t_i$ only and does
not depend on $x_{i-1}$, $x_{i-2}$ etc. (If this is not true then (1.6) and (1.10)
cannot hold).

This extra idea that the probabilities can vary from time to time 
has interesting consequences. One can visualise accidents happening 
by pure chance and yet being to some extent controllable. One can 
imagine the whole environment being altered in a factory by the intro- 
duction of safety devices on the machinery; or that every employee is 
taught to be more careful by a “safety campaign”: accidents would 
still happen by “pure chance”, but there would be fewer accidents.

Greenwood (43) also obtained the Poisson distribution (1.17) as his 
model of a “pure chance” distribution, but here it is obtained by a 
moment to moment analysis of the occurrence of accidents under certain 
circumstances, while Greenwood argued by analogy with balls thrown 
at pigeon holes. One result of this is that Greenwood’s $\lambda$ was a constant,
unassociated with time, while here $\lambda = \int_0^T f(t)\,dt$, which is a more
general and flexible result. 

The above is theoretical: has the existence of a homogeneous population
in nature ever been demonstrated? Surely not, for various
reasons. One might agree that drawings made of balls from an urn
constitute a statistical population governed by “pure chance” and even
this has been queried, but in general nature is infinitely variable; human
beings are never “exactly alike”, the environment they work in at any
time is never quite the “same” for each individual. We simply do not
believe that a “pure chance” distribution of accidents exists in nature,
though admittedly this statement cannot be rigidly proved. In turn,
this is due to the fact that one can never observe a complete statistical
population, but only samples drawn from it. Suppose such a sample
refers to a time period 0 to $T$ and that the observed distribution of
accidents is “nearly Poisson”. The observed distribution is tested, say
by the chi-square test, and a satisfactory “fit” obtained. This is no
proof that the sample must have come from a Poisson population: the
test only shows that it may have done so.
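
The chi-square test of fit mentioned above can be sketched as follows. This is an illustration under stated assumptions, not a procedure taken from the text: the frequency table `observed` is hypothetical, $\lambda$ is estimated by the sample mean (treating the final pooled cell as if it held exactly that number of accidents), and the degrees of freedom are taken as the number of cells less two.

```python
import math

def poisson_chi_square(observed):
    """observed[x] = number of individuals with x accidents; the last cell is
    read as 'that many or more'. Returns (statistic, degrees of freedom, expected)."""
    n = sum(observed)
    mean = sum(x * f for x, f in enumerate(observed)) / n    # estimate of lambda
    probs = [math.exp(-mean) * mean ** x / math.factorial(x)
             for x in range(len(observed) - 1)]
    probs.append(1.0 - sum(probs))                           # pooled upper tail
    expected = [n * p for p in probs]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    dof = len(observed) - 2      # cells - 1, less 1 for the fitted mean
    return statistic, dof, expected

# Hypothetical frequencies for 0, 1, 2, 3 and 4-or-more accidents.
observed = [50, 40, 20, 8, 2]
statistic, dof, expected = poisson_chi_square(observed)
print(statistic, dof)
```

A small value of the statistic relative to its degrees of freedom indicates only that the sample is compatible with a Poisson population, which is the limited conclusion drawn in the text.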

However, let us assume for the moment that the population has a
Poisson distribution over the period 0 to $T$; then in turn this is no
proof that the population is homogeneous in the sense defined in (1.9)
and (1.10). If these two equations hold, then (1.17) follows and the
distribution over the interval 0 to $T$ is Poisson. But the converse
theorem, that if (1.17) is true (1.9) and (1.10) follow, has not been and cannot
be proved. To prove homogeneity, then, it would be necessary to test
(1.9) and (1.10) directly by splitting the time interval 0 to T up into 
subintervals, and proving that the distribution within each subinterval 
is Poisson, and that what happens in pairs of subintervals is uncor- 
related, and so on. Clearly one can never prove rigidly that a popula- 
tion is homogeneous. On the other hand, if in practice one split the 
complete time period up into two subintervals and found that, as far 
as one could tell, the distributions were Poisson for all three intervals 
and no correlation existed between the two subintervals then one would 
be willing to assert that the population was “nearly homogeneous”. 
Two main ideas emerge from the above— 


(a) The concept of the homogeneous population, which though it may 
never exist in nature will be a useful standard against which to 
compare what actually does occur. 


(b) The idea that it will be sensible, whenever feasible, to split any 
period of observation into two or more subperiods, and examine and
compare what goes on in each interval. 
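
Idea (b) can be illustrated by computing the coefficient of correlation between the numbers of accidents in two subperiods. The sketch below is hypothetical throughout: `records` is an invented list of pairs $(x_0, x_1)$, one pair per individual, and the function is the ordinary product-moment coefficient written out in full.

```python
import math

def correlation(pairs):
    """Product-moment correlation between first-period and second-period counts."""
    n = len(pairs)
    mean0 = sum(a for a, _ in pairs) / n
    mean1 = sum(b for _, b in pairs) / n
    cov = sum((a - mean0) * (b - mean1) for a, b in pairs) / n
    var0 = sum((a - mean0) ** 2 for a, _ in pairs) / n
    var1 = sum((b - mean1) ** 2 for _, b in pairs) / n
    return cov / math.sqrt(var0 * var1)

# Hypothetical records: (accidents in first subperiod, accidents in second subperiod).
records = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1),
           (0, 0), (3, 2), (1, 0), (0, 2), (2, 3)]
r01 = correlation(records)
print(r01)
```

For a homogeneous population the corresponding population coefficient is zero by (1.20), so an observed value well away from zero is evidence against homogeneity.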


SECTION II: THE MEASURE OF ACCIDENT PRONENESS, AND THE DISCUSSION OF 
A POPULAR FALLACY 

Greenwood’s original hypothesis of a “pure chance” distribution
has been dealt with in the previous section, and the concept of a homo- 
geneous population arrived at, which will be used as an ideal standard. 
To deal with the sort of thing that happens in practice, Greenwood 
advanced two further hypotheses, one of which led later to the idea of 
accident proneness. 

In the literature on accident statistics, considerable use is made of 
the term accident proneness. It is a difficult matter to define what is 
meant by this term and to evolve a sensible measure of whatever it 
indicates. The reader is warned to examine carefully any meaning at- 
tributed to the term accident proneness in this monograph. It may 
differ somewhat from the meaning which he attaches to it. 

Apparently it is meant to indicate a personal trait in an individual 
as opposed to some characteristic of the environment in which he 
works. Suppose $n$ individuals enter a “hazardous occupation”. All work under
“similar conditions” and accidents occasionally occur; but it is said that
Smith is more prone to have accidents than Jones. “More” suggests
that we expect to be able to measure or at least to rank this quality of
“accident proneness”. Within a given period of time Smith has two
accidents and Jones has five. Does this prove that Jones is more accident 
prone than Smith? Surely not. It might happen that within another 
similar period of time Smith would have three accidents and Jones only 
one. Accidents might be occurring by “pure chance”. Some writers 
appear to have overlooked this fact. See Mintz and Blum (42) for some 
trenchant remarks on this point, and the comments on p. 396. 

As a starting point, try to imagine a population of “equally accident
prone” individuals who are all working in the “same environment”.
Surely here, if anywhere, we deal with a “pure chance” situation and 
following the arguments on p. 394 end up with equations (1.9) and (1.10) 
and the Poisson distribution (1.17). That is, a population of equally 
accident prone individuals, working in the same environment is a homo- 
geneous population. In a sense, this is the definition of “equally accident 
prone” used in this monograph. The reader, of course is free to agree 
or disagree with this definition. 

Having thus envisaged a population of equally accident prone people,
how can this proneness be measured? The parameter $\lambda = \int_0^T f(t)\,dt$
represents the average number of accidents per individual during the
period 0 to $T$.* (2.1)

So $\lambda/T = a$, say, is the average accident rate during this period
= average number of accidents per individual per unit time (2.2)

*See any standard text such as Weatherburn (80).

At first sight, $a$ would appear to be a natural measure of proneness,
but a little thought raises certain difficulties.

First, $a$ is an average measure referring to the time period 0 to $T$,
and conditions may have varied widely within this period. If certain
conditions alter from time to time the $p_i(x_i)$ of equation (1.9) will
alter; yet if these conditions affect all the individuals alike the population
will remain homogeneous. It is quite reasonable to suppose that
the introduction of more modern equipment might alter the environment
for the whole population: reduce $f(t)$ and $a$, yet leave the population
homogeneous. Again, in theory, it is possible that a safety first
campaign might by explanation and education affect every individual
in the same manner. Everybody remains equally accident prone
although less prone to have accidents. Again, everybody “learns by
experience”, and it is possible mathematically if not psychologically for
everybody to learn alike. The $p_i(x_i)$ would alter from moment to
moment while the population remained homogeneous.

So, observations made in the manner envisaged (i.e. number of
individuals who have 0 or 1 or 2 ... accidents within period $0 < t < T$)
can only result in average measures $a$.

What is far more serious, from the point of view of measuring even
average accident proneness, is that the above arguments make it clear
that what is observed in the manner suggested is due to the inextricable
mingling of the effects of environment with the effects of “personal
attributes”. At best, $a$ is only a measure of average accident
proneness relative to a specific environment. Many writers stressing these
facts would term $a$ a measure of average accident liability in sharp
distinction to any proposed measure of accident proneness. Put bluntly,
no absolute measure of accident proneness exists. Thus the concept of 
accident proneness remains uncomfortably nebulous. But it must be 
noted that proneness as visualised (however vaguely) in this monograph 
can be altered by experience and education, for example, and need not 
be constant. No attempt will be made to analyse it into further entities 
such as “innate capabilities”, “effects of experience”, etc. Very little 
further reference to “proneness” will be made in what follows. 

To some readers this may seem to run contrary to certain trends 
in psychological literature, but the authors confess that they find the 
fashion of talking about nebulous entities which cannot be clearly 
defined or adequately measured both wearisome and sterile. 

Before the theory is developed further, a popular fallacy will be 
examined. 

Consider the observed distribution of accidents among 122 shunters 
during the six years 1937-42 shown at the bottom of Table I.

For observations of this nature, the following argument is often 
advanced. “Most of the accidents occurred to a few of the employees.
If we discharge these men then in future the annual accident rate will
be greatly reduced”.

If the population sampled is homogeneous, this statement is false. 
From the arguments on p. 394 it follows that the removal of people who 
had a specified number of accidents during one time interval has no 
effect on the distribution of accidents during a subsequent time interval.
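
The claim that, in a homogeneous population, removing those with many accidents in one period leaves the later distribution unchanged can be illustrated by simulation. The sketch below is an assumption-laden illustration: the population size, the two Poisson parameters, the cut-off of three accidents and the seed are all hypothetical, and the sampler is Knuth's multiplication method rather than anything described in the text.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson variate by Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def mean(values):
    return sum(values) / len(values)

def simulate(n_people=20000, lam_first=1.2, lam_second=1.0, cutoff=3, seed=1):
    """Homogeneous population: every individual has the same parameters, and the
    two periods are independent. Returns second-period means for everybody,
    for those kept, and for those 'discharged' after the first period."""
    rng = random.Random(seed)
    first = [poisson_sample(lam_first, rng) for _ in range(n_people)]
    second = [poisson_sample(lam_second, rng) for _ in range(n_people)]
    kept = [s for f, s in zip(first, second) if f < cutoff]
    removed = [s for f, s in zip(first, second) if f >= cutoff]
    return mean(second), mean(kept), mean(removed)

overall, kept, removed = simulate()
print(overall, kept, removed)
```

Apart from sampling fluctuation the three second-period means agree, so nothing is gained by the discharge; any gain in practice must come from non-homogeneity.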

But that is pure theory: see what could happen in a practical case 
by studying the data for both periods given in Table I. Here $r_{01} = .258$
and other evidence also suggests that the data are non-homogeneous 
(see p. 407). 

In the first six years 122 men had 155 accidents (an observed rate
of .212 accidents per individual per annum). Of these, 17 men had 64
accidents, or 14% of the men accounted for 41% of the accidents. The 
remaining 105 men had 91 accidents (an observed rate of .132 which is 
only 62% of .212). 

The suggestion is made that if the 17 men were discharged a similar 
reduction would occur in the accident rate for the second period. This 
implicitly assumes that all the 17 men will have “many” accidents 
during the next period and all the 105 men will have “few” accidents. 

On examining what actually happened during the next 5 years we 
see that all 122 men had 119 accidents (an observed rate of .195) while
the 105 men had 92 accidents (an observed rate of .175), which is 90%
of .195. Omitting the 17 men has resulted in a reduction of the rate, 
but this is not nearly as great as expected. The reason for this is easily 
seen from Table I and is of fundamental importance: (a) Many of the 
17 men who had three or more accidents during the first period had 
fewer than three during the second period (13/17 = 78%). (b) Some 
of the 105 men who had fewer than three accidents during the first 
period had three or more during the second period (8/105 = 8%). 
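
Several of the rates quoted above can be re-derived from the counts given in the text. The short sketch below does only that arithmetic; the variable names are ours, and rates are expressed as accidents per individual per annum.

```python
men_all, men_low = 122, 105            # 17 men had three or more accidents in period one
first_years, second_years = 6, 5

# First period: all 122 men, 155 accidents.
rate_first_all = 155 / (men_all * first_years)
share_men = 17 / men_all               # proportion of men with three or more accidents
share_accidents = 64 / 155             # proportion of accidents incurred by those men

# Second period: all 122 men had 119 accidents; the 105 men had 92.
rate_second_all = 119 / (men_all * second_years)
rate_second_low = 92 / (men_low * second_years)
ratio_second = rate_second_low / rate_second_all

print(round(rate_first_all, 3), round(rate_second_all, 3),
      round(rate_second_low, 3), round(100 * ratio_second))
```

The second-period ratio is the quantity that matters for the argument: omitting the 17 men lowers the later rate only to about nine-tenths of its value for the whole group.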

Thus, the assumptions so frequently made contain two errors, and 
it is impossible to assess the effects of these errors without some
knowledge of $p(x_0, x_1)$.

This has been clearly demonstrated by Arbous and Sichel (59) in a 
study of the cognate subject of absences, and stresses again the importance
of observing accident statistics over two time intervals.

Roughly speaking, it may be said that the larger $\rho_{01}$ is, the less
the effect of these errors will be, but no accurate predictions can be
made without a sound knowledge of $p(x_0, x_1)$. Dismissal of men merely
because they have a “large” number of accidents during one period of 
observation may well entail rank injustice towards most of these men. 
Admittedly the effect of the dismissals upon the remainder might be to 
make these more careful and markedly reduce their accident rate in 
future; but we must not claim that this admirable practical result has 
been achieved by the recognition and elimination of the accident prone! 


SECTION III: COMPOUND POISSON DISTRIBUTIONS 


In the previous sections Greenwood’s original hypothesis that acci- 
dents might be due to “pure chance” was analysed and the conclusion 
reached that if this were true the population would be homogeneous 
in the sense defined by equations (1.9) and (1.10), and would have the 
Poisson distribution given in equation (1.17). 

The idea of accident proneness was then examined and the conclusion
reached that a population of equally accident prone individuals,
working in the same environment would be homogeneous. Next, it was
realised that even if these individuals were equally accident prone, no 
absolute measure existed of just how prone they all were to have acci- 
dents. At best, only a measure of accident proneness relative to a 
specific environment could be obtained. An obvious measure of this was
$a = \lambda/T$ = average number of accidents per individual per unit time,
of equation (2.2). It was decided to term $a$ a measure of average
accident liability to mark the fact that it was a relative and not an 
absolute measure of proneness. Thus the idea was reached that a homo- 
geneous population consists of individuals who are all equally liable to 
have accidents, but that by pure chance, some of them meet with more 
accidents than others, within any finite time interval. It was necessary 
to examine these ideas about proneness in some detail, because they 
occur frequently in the standard literature, but the suggestion is made 
that what is of prime importance is that the reader understand how a 
homogeneous population, as defined by (1.9) and (1.10) behaves, and 
remembers that the direct physical significance of the parameter $a$ is
“average number of accidents per individual per unit time”.

As mentioned earlier, the vast majority if not all of the populations 
met with when dealing with accident statistics are non-homogeneous. 
If a population is not homogeneous then how is it to be described? 
Greenwood in 1919 (43) suggested two hypotheses to meet this case. 
(1) Compound Poisson distributions, which will be discussed in this 
section. (2) Biassed distributions (frequently termed contagious dis- 
tributions) which will be dealt with in the next section. 

If a population is not homogeneous, then one possibility is that it
consists of a mixture of two or more homogeneous populations, that is,
that it has a compound Poisson distribution. If a proportion $p_1(\lambda_1)$ of the
mixed population comes from a homogeneous population with parameter
$\lambda_1$ and a proportion $p_2(\lambda_2)$ from a homogeneous population with
parameter $\lambda_2$, then the distribution of the mixture is given by

$$p(x) = p_1(\lambda_1)\,\frac{e^{-\lambda_1}\lambda_1^x}{x!} + p_2(\lambda_2)\,\frac{e^{-\lambda_2}\lambda_2^x}{x!} \quad \text{where } p_1 + p_2 = 1 \qquad (3.1)$$

More generally we might have a mixture of $k$ homogeneous populations
so that

$$p(x) = \sum_{i=1}^{k} p_i(\lambda_i)\,\frac{e^{-\lambda_i}\lambda_i^x}{x!} \quad \text{where } \sum_{i=1}^{k} p_i = 1 \qquad (3.2)$$

Still more generally, we might assume that the $\lambda$’s within the population
have a continuous distribution, so that individuals with a given
$\lambda$ have a distribution

$$p(x \mid \lambda) = \frac{e^{-\lambda}\lambda^x}{x!} \qquad (3.3)$$

while the distribution of the population as a whole is

$$p(x) = \int_0^\infty p(x \mid \lambda)\,p(\lambda)\,d\lambda \qquad (3.4)$$

Quite generally any mixture of homogeneous populations can be
expressed in the form of a Stieltjes integral

$$p(x) = \int_0^\infty p(x \mid \lambda)\,dF(\lambda) \qquad (3.5)$$

where $F(\lambda)$ is the cumulative distribution of $\lambda$, i.e. is the probability
that this variable has a value $\lambda$ or less.

If the population consists of a mixture of homogeneous populations,
we want to know how it is built up. We want to know the values of
$\lambda$ in the constituent populations, and what proportion of the whole
population each constituent is. This information is provided by $F(\lambda)$
or $dF(\lambda)$ and, in theory, complete knowledge of the distribution of $\lambda$ can
be obtained from a knowledge of $p(x)$ as follows:

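The f.m.g.f. of $p(x)$ is

$$\begin{aligned}
M[x, u] &= \sum_{x} (1 + u)^{x}\, p(x) \\
&= \int_0^\infty \sum_{x} (1 + u)^{x}\, \frac{e^{-\lambda}\lambda^{x}}{x!}\, dF(\lambda) \quad \text{from (3.5) and (3.3)} \\
&= \int_0^\infty e^{\lambda u}\, dF(\lambda) \\
&= M(\lambda, u), \text{ the m.g.f. of } \lambda
\end{aligned} \tag{3.6}$$

This is equivalent to saying that

$$\mu_{[k]}(x) = \mu'_k(\lambda) \tag{3.7}$$

where $\mu_{[k]}(x)$ is the k-th factorial moment of $x$
and $\mu'_k(\lambda)$ is the k-th moment of $\lambda$ about the origin.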

Thus, in theory the $\lambda$ distribution can be obtained when the $x$
distribution is known. In particular

$$\mu'_1(\lambda) = \mu'_1(x), \qquad \mu_2(\lambda) = \mu_2(x) - \mu'_1(x) \tag{3.8}$$

gives the mean and variance of $\lambda$ in terms of the mean and variance of $x$.
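Equation (3.8) can be checked numerically on a mixture whose composition is known. The sketch below is an illustration only: the two-component mixture (proportions 0.7 and 0.3, parameters 1 and 4) and all function names are assumptions made for the example, not data or notation from the paper.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Poisson probability of x events with parameter lam, as in (3.3)."""
    return exp(-lam) * lam ** x / factorial(x)


def mixture_pmf(x, components):
    """Mixture of Poisson distributions.

    components is a list of (proportion, lambda) pairs whose proportions
    sum to one (hypothetical input format chosen for this sketch).
    """
    return sum(p * poisson_pmf(x, lam) for p, lam in components)


def lambda_moments_from_x(pmf, x_max=100):
    """Mean and variance of lambda recovered from the distribution of x.

    Uses (3.8): mean(lambda) = mean(x), var(lambda) = var(x) - mean(x).
    The sums are truncated at x_max, which is ample for small parameters.
    """
    mean_x = sum(x * pmf(x) for x in range(x_max))
    var_x = sum((x - mean_x) ** 2 * pmf(x) for x in range(x_max))
    return mean_x, var_x - mean_x


# Illustrative mixture: 70% of individuals with lambda = 1, 30% with lambda = 4.
components = [(0.7, 1.0), (0.3, 4.0)]
mean_lam, var_lam = lambda_moments_from_x(lambda x: mixture_pmf(x, components))

# Direct moments of lambda for comparison.
direct_mean = sum(p * lam for p, lam in components)
direct_var = sum(p * lam ** 2 for p, lam in components) - direct_mean ** 2
print(mean_lam, direct_mean)
print(var_lam, direct_var)
```

The two printed pairs should agree to rounding error, which is the content of (3.8): the excess of the variance of $x$ over its mean measures the spread of the individual $\lambda$'s.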

In practice then, if it is reasonable to suppose that a mixture of
Poisson distributions is present, we can, from equation (3.7), use
estimates of the moments of the distribution of $x$ to obtain estimates of
the first three or four moments of the distribution of $\lambda$ and hence, using
standard methods, get some sort of picture of this distribution. A very
attractive idea.

This method was explained and employed lavishly by Newbold in
1927 (46). The results suggested, for her data, that the $\lambda$ distribution
was a Pearson Type III, as originally suggested by Greenwood in 1919
(43). This results in $x$ having a negative binomial distribution.

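Proof: Using Anscombe’s notation (82) assume

$$p(\lambda) = \frac{c^{k}}{\Gamma(k)}\, \lambda^{k-1} e^{-c\lambda}, \quad \text{where } c = k/m \tag{3.9}$$

Then, from (3.7)

$$M(\lambda, u) = c^{k}\,(c - u)^{-k} \tag{3.10}$$

$$= \left(\frac{k/m}{k/m - u}\right)^{k} \quad \text{from (3.9)} \tag{3.11}$$

or

$$M[x, u] = \left(1 - \frac{mu}{k}\right)^{-k} \quad \text{from (3.6)} \tag{3.12}$$

So

$$G(x, u) = M[x, u - 1] = \left(1 + \frac{m}{k} - \frac{mu}{k}\right)^{-k} \quad \text{from (3.11)} \tag{3.13}$$

whence

$$p(x) = \frac{\Gamma(k + x)}{x!\,\Gamma(k)} \left(1 + \frac{m}{k}\right)^{-k} \left(\frac{m}{m + k}\right)^{x} \tag{3.14}$$

The phrase “negative binomial distribution” is due to the form of
$G(x, u)$.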
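From (3.12) it readily follows that

$$\mu'_1(x) = m, \qquad \mu_2(x) = m\left(1 + \frac{m}{k}\right)$$

So that

$$m = \mu'_1(x), \qquad k = \frac{[\mu'_1(x)]^{2}}{\mu_2(x) - \mu'_1(x)} \tag{3.15}$$

Hence, estimates for m and k are

$$\hat m = \bar x, \qquad \hat k = \frac{\bar x^{2}}{s^{2} - \bar x} \tag{3.16}$$

where $\bar x = \sum x_i / n$ and $s^{2} = \sum (x_i - \bar x)^{2} / n$.
The advantage of using k and m is that (Anscombe) for large samples
of size n, $\hat m$ and $\hat k$ are practically independent with

$$\operatorname{var}(\hat m) = \frac{m}{n}\left(1 + \frac{m}{k}\right), \qquad \operatorname{var}(\hat k) = \frac{2k(k + 1)}{n}\left(1 + \frac{k}{m}\right)^{2} \tag{3.17}$$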


(Maximum likelihood estimates of m and k have been given by 
Sichel (83).) 


As an example, apply the above to the data for the eleven year
period given in Table I. Here m = 2.246 ± .174 and k = 3.52 ± 1.31.
The standard errors show that m is “well determined” but that k is not.
The distribution np(x) with these values of k and m is given in Table IV.
A chi-square test for goodness of fit gives a satisfactory result (P = .60
for 4 d.f.). By way of contrast, a Poisson distribution with the same
mean gives a very poor fit (P < .001 for 4 d.f.).
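The moment estimates and their large-sample standard errors are easy to compute. The following Python sketch assumes the forms $\hat m = \bar x$, $\hat k = \bar x^2/(s^2 - \bar x)$, $\operatorname{var}(\hat m) = m(1 + m/k)/n$ and $\operatorname{var}(\hat k) = 2k(k+1)(1 + k/m)^2/n$; the function names are hypothetical, and the sample size n = 122 is the number of shunters quoted for Table I.

```python
from math import sqrt


def nb_moment_estimates(counts):
    """Moment estimates (m, k) of a negative binomial from accident counts.

    counts is a list with one accident total per individual.
    The estimate of k only exists when the sample variance exceeds the mean.
    """
    n = len(counts)
    xbar = sum(counts) / n
    s2 = sum((x - xbar) ** 2 for x in counts) / n
    if s2 <= xbar:
        raise ValueError("sample variance does not exceed the mean; k is undefined")
    return xbar, xbar ** 2 / (s2 - xbar)


def nb_standard_errors(m, k, n):
    """Large-sample standard errors of the moment estimates of m and k."""
    se_m = sqrt(m * (1.0 + m / k) / n)
    se_k = sqrt(2.0 * k * (k + 1.0) * (1.0 + k / m) ** 2 / n)
    return se_m, se_k


# Values quoted in the text for the eleven year period.
se_m, se_k = nb_standard_errors(m=2.246, k=3.52, n=122)
print(round(se_m, 3), round(se_k, 2))
```

With the quoted values of m and k the two standard errors come out close to the .174 and 1.31 given in the text, which illustrates why m is well determined while k is not.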

The evidence examined here tends to confirm the hypotheses made 
in (3.4) and (3.9). If these hypotheses are accepted, then the distri- 
bution of the individual “liabilities” is given by (3.9) which shows that 


$$\nu = \frac{2k}{m}\,\lambda \ \text{ has a chi-square distribution with } 2k \text{ degrees of freedom} \tag{3.18}$$


and the behaviour of $\lambda$ can be studied.

For the example given then, $\nu = 3.14\lambda$ would have a chi-square
distribution with 7 degrees of freedom, very nearly. This would mean
that while the average liability is 2.25, for instance, 5% of the
population have a liability that is greater than 4.50 and 5% a liability
that is less than 0.69. A more detailed picture can be obtained if
required.
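The percentage points quoted above can be reproduced without tables. The sketch below is an illustration, not the authors' computation: it evaluates the chi-square distribution function through the power series for the incomplete gamma function and inverts it by bisection, using only the Python standard library. All function names are hypothetical, and 2k = 7.04 is rounded to 7 degrees of freedom as in the text.

```python
from math import exp, log, lgamma


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma function P(a, x) by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-17 * total:
        term *= x / (a + n)
        total += term
        n += 1
    return total * exp(a * log(x) - x - lgamma(a))


def chi2_cdf(v, df):
    """Chi-square distribution function with df degrees of freedom."""
    return reg_lower_gamma(df / 2.0, v / 2.0)


def chi2_quantile(p, df):
    """Value v with chi2_cdf(v, df) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k, m = 3.52, 2.246          # estimates quoted in the text
scale = 2.0 * k / m         # nu = scale * lambda, about 3.14
df = 7                      # 2k = 7.04, taken as 7 "very nearly"

lower = chi2_quantile(0.05, df) / scale   # 5% have a smaller liability
upper = chi2_quantile(0.95, df) / scale   # 5% have a larger liability
print(round(lower, 2), round(upper, 2))
```

The two limits land near the 0.69 and 4.50 quoted in the text; small differences arise from rounding the multiplier to 3.14.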

On studying the literature, it appears that many writers accept 
Newbold’s work as proof that accident distributions are a mixture of 
Poisson distributions, and further, they think that the mixture must
result in a negative binomial distribution. (They then lose the courage 
of their convictions and fail to make full use of equation (3.18).) 

Both ideas are nonsense, for two reasons, one mathematical the 
other practical. Mathematically, if (3.3) and (3.9) hold then (3.14) 
follows and the distribution is negative binomial. The converse theorem 
has not been proved and is not true. In the next section the same 
distribution will be obtained as the consequence of an entirely different 
set of hypotheses. Practically, when given an observed distribution, one 
can never prove that the underlying population distribution is exactly 
negative binomial, or exactly anything else. At the best, any hy- 
pothesis is merely “plausible’’. 

More generally, we can apply equation (3.7) to practically any $p(x)$,
where $x$ is discrete, and thus interpret it as a mixture of Poisson
distributions (provided that the resultant $\mu_{2r}(\lambda)$ are positive). But this
is no proof that this $p(x)$ is built up in this manner. It might happen
that $p(x)$ was built up in such a manner that this interpretation is
mathematically possible, but physically most unlikely. Still more
drastically, $p(x)$ may be of a form for which $\mu_2(\lambda)$ would be negative, which
is physically impossible. (An example is given in the next section.)

In spite of all these difficulties, which are inherent in any attempt 
at inductive reasoning, the fact remains that when one has examined 
all the evidence available one gains the impression that one hypothesis 
is more plausible than the others. To defend this hypothesis is one 
thing—to accept it blindly is another. 


SECTION IV: CONTAGIOUS DISTRIBUTIONS


If a population is not homogeneous, then one obvious possibility is 
that this is because one individual is more liable to have an accident 
than another, using liable in the sense explained on p. 399. This was 
Greenwood’s second hypothesis and some of its consequences have been 
elaborated in the previous section. 

Another equally obvious possibility is that the fact that a person 
has had an accident might affect the probability of his having another
accident. Distributions built up on this type of hypothesis are termed
“contagious distributions”.

Historically, this was Greenwood’s third hypothesis. He termed the 
resultant distributions “‘biassed” distributions. Greenwood and Yule 
wrote a paper on these distributions in 1920 (44). Their algebra was 
intractable and their results did not fit their data particularly well, so 
the idea seems to have been abandoned in the further literature of 
accident statistics. Meanwhile, workers in other fields were developing 
hypotheses of this type under the name of “contagious’’ distributions. 
(See Feller (74) (84) for descriptions and further references). Some of 
their basic ideas will be developed here and applied to accident statistics.* 

Given a population of employees the fundamental assumption is 
made that by time 


t = 0 none of the individuals have had an accident (4.1) 


This will be true if we deal with a population who are just be- 
ginning a type of work that is new to them. 


Let $p(x, t)$ be the probability that an individual has had
$x$ accidents by time $t$ (4.2)


Assume that during the interval of time $t$ to $t + dt$ an individual can
have (apart from infinitesimals of higher order)


0 accidents, with probability $1 - f(x, t)\,dt$
or 1 accident, with probability $f(x, t)\,dt$ (4.3)


This is essentially the same hypothesis as given in (1.9) with the 
addition that the probability that an individual has an accident within 
a very short interval of time depends on the number of accidents that 
he has had previously. They are contingent probabilities, and from
(4.3) and the addition and multiplication principles (see Appendix), it
follows that:


$$\begin{aligned}
p(0, t + dt) &= p(0, t)\,\bigl(1 - f(0, t)\,dt\bigr) \\
p(1, t + dt) &= p(1, t)\,\bigl(1 - f(1, t)\,dt\bigr) + p(0, t)\,f(0, t)\,dt
\end{aligned}$$


*The term contagious distribution is widely accepted by now and for some applications this is a 
sensible terminology, but for accident statistics it is misleading. In an early application of a distribu- 
tion of this type (Lundberg (78)), x = number of smallpox cases observed in a given month. The
statistical “individual” was the month. If one case of smallpox occurred within that month this
increased the probability of more cases occurring within the same month by contagion among the
human beings observed during that month. In this case the phrase “contagious distribution” was a
sensible one. But when dealing with accident statistics the statistical individual is the employee.
The fact that he has had an accident may perhaps increase the probability of his having another 
accident. This might be called “‘self infection” but not contagion. Contagion surely suggests that 
the fact that one man has had an accident increases the probability of other men having accidents, 
while the “‘contagious” distributions discussed here are based on hypotheses that make no allowance 
for this possibility, when applied to accident statistics.

$$p(2, t + dt) = p(2, t)\,\bigl(1 - f(2, t)\,dt\bigr) + p(1, t)\,f(1, t)\,dt \tag{4.4}$$
etc.
or

$$\begin{aligned}
dp(0, t) &= -f(0, t)\,p(0, t)\,dt \\
dp(1, t) &= -f(1, t)\,p(1, t)\,dt + f(0, t)\,p(0, t)\,dt \\
dp(2, t) &= -f(2, t)\,p(2, t)\,dt + f(1, t)\,p(1, t)\,dt
\end{aligned} \tag{4.5}$$
etc.,
an infinite set of simple linear differential equations, whose successive
solution gives in theory, and often fairly simply in practice, the
successive probabilities required.


Note that on multiplying the first equation by unity, the second by
$u$, the third by $u^{2}$ and so on and then summing, we obtain

$$dG(x, u) = (u - 1)\,H(x, u)\,dt \tag{4.6}$$

where

$$H(x, u) = \sum_{x} u^{x} f(x, t)\, p(x, t)$$

which is a single differential equation, the solution of which gives the
p.g.f. for $x$. On occasion, this affords a quicker method for obtaining
$p(x, t)$.

For instance, if $f(x, t) = f(t)$ is a function of $t$ only then (4.6)
becomes

$$dG(x, u) = (u - 1)\,f(t)\,G(x, u)\,dt$$

the solution of which is

$$G(x, u) = e^{(u - 1)\lambda_t}$$

where

$$\lambda_t = \int_0^t f(t)\,dt \tag{4.7}$$

But this is equation (1.17). $p(x, t)$ is Poisson, the population is
homogeneous, and we merely have an alternative proof for the results
of Section I, which now turns out to be a very special case of a much
more general theory.
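The special case can be confirmed by solving the differential equations (4.5) numerically. The Python sketch below is an illustration under stated assumptions: the system is truncated at a largest count x_max, it is integrated with a classical fourth-order Runge-Kutta scheme, and the rate f(t) = 0.5 + 0.2t is an arbitrary choice made for the example. The function name is hypothetical.

```python
from math import exp, factorial


def solve_birth_equations(f, t_end, x_max=30, steps=2000):
    """Integrate dp(x)/dt = -f(x, t) p(x) + f(x - 1, t) p(x - 1) from t = 0.

    f(x, t) is the accident rate for an individual with x previous accidents.
    Starts from p(0) = 1, i.e. nobody has had an accident at t = 0, as in (4.1).
    Returns the list [p(0, t_end), ..., p(x_max, t_end)].
    """
    def deriv(p, t):
        d = [-f(0, t) * p[0]]
        for x in range(1, x_max + 1):
            d.append(-f(x, t) * p[x] + f(x - 1, t) * p[x - 1])
        return d

    p = [1.0] + [0.0] * x_max
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = deriv(p, t)
        k2 = deriv([pi + 0.5 * h * ki for pi, ki in zip(p, k1)], t + 0.5 * h)
        k3 = deriv([pi + 0.5 * h * ki for pi, ki in zip(p, k2)], t + 0.5 * h)
        k4 = deriv([pi + h * ki for pi, ki in zip(p, k3)], t + h)
        p = [pi + h * (a + 2 * b + 2 * c + d) / 6.0
             for pi, a, b, c, d in zip(p, k1, k2, k3, k4)]
        t += h
    return p


# Rate depending on time only: f(t) = 0.5 + 0.2 t (illustrative choice).
probs = solve_birth_equations(lambda x, t: 0.5 + 0.2 * t, t_end=2.0)

# Integral of f from 0 to 2 is 0.5 * 2 + 0.1 * 2**2 = 1.4.
lam_t = 1.4
poisson = [exp(-lam_t) * lam_t ** x / factorial(x) for x in range(10)]
print(max(abs(a - b) for a, b in zip(probs, poisson)))
```

The printed maximum difference should be negligible, in line with (4.7). The same routine accepts any rate f(x, t), so it can also be used for the models discussed below in which the rate depends on the number of previous accidents.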

To what sort of data can one expect these more general theories to 
apply? It is an old saying that “the child who has burnt his fingers
fears the fire”, which suggests that the beginner, after making one
mistake, becomes very much more careful. Note the two points: it is
the beginner who can be expected to learn in this manner (you cannot 
teach an old dog new tricks) and one mistake has a profound effect 
upon the man. 

With the first point in mind, examine Table VI, in which figures 
are given concerning the accidents met with by a group of 552 shunters 
during their first year of employment (Adelstein (53)). The data have 
been divided into three age groups, as shown, and have the following 
peculiar property: in each case $\bar x$ is greater than $s^{2}$. This might be
due to chance sampling effects, but accepting the results at their face
value this would mean that $\mu'_1(x)$ is greater than $\mu_2(x)$, and that these
data cannot be described by a compound Poisson distribution because
$\mu_2(\lambda)$ would be negative (see equation (3.8)), which is physically
impossible.

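With the second point in mind, assume, with Greenwood and Yule
(44), that in equation (4.3)

$$f(0, t) = \delta \quad \text{and} \quad f(x, t) = \epsilon \ \text{ for } x \ge 1 \tag{4.8}$$

where $\delta$ and $\epsilon$ are positive constants.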


This is of course a crassly oversimplified model of what may happen. 
It assumes that until a man has had his first accident the probability 
that he has an accident within a short interval of time $dt$ is $\delta\,dt$, but
that as soon as he has had his first accident this probability decreases
to $\epsilon\,dt$ and thereafter remains constant no matter how many more
accidents he may have. All “learning” takes place at the moment 
when he “burns his fingers”. No allowance is made for the effects of 
further accidents, and the probabilities defined in (4.8) are independent 
of time so that the possibilities mentioned on p. 396 which from time to 
time might affect the whole population, are not allowed for. 

However, all models are gross oversimplifications and it may turn 
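out that this model proves to be a useful first approximation for certain
types of data.

When (4.8) holds, (4.5) becomes

$$\begin{aligned}
dp(0, t) &= -\delta\, p(0, t)\,dt \\
dp(1, t) &= -\epsilon\, p(1, t)\,dt + \delta\, p(0, t)\,dt \\
dp(2, t) &= -\epsilon\, p(2, t)\,dt + \epsilon\, p(1, t)\,dt \\
dp(3, t) &= -\epsilon\, p(3, t)\,dt + \epsilon\, p(2, t)\,dt
\end{aligned} \tag{4.9}$$

whence (4.6) becomes

$$dG(x, u) = \epsilon (u - 1)\, G(x, u)\,dt + (\delta - \epsilon)(u - 1)\, p(0, t)\,dt$$

which is readily found to have the solution

$$G(x, u) = \frac{\delta u\, e^{\epsilon (u - 1) t} - (\delta - \epsilon)(u - 1)\, e^{-\delta t}}{\epsilon u + \delta - \epsilon} \tag{4.10}$$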


This is useful for obtaining the moments of the distribution (by
replacing $u - 1$ by $u$ and expanding), but the simplest way of obtaining
$p(x)$ seems to be by solving equations (4.9) in succession, which readily
gives

$$p(0) = e^{-\delta t}, \qquad p(1) = \frac{\delta}{\delta - \epsilon}\, e^{-\delta t}\left[e^{(\delta - \epsilon) t} - 1\right]$$
etc.
which for purposes of computation may be written

$$p(0) = e^{-\delta t}$$
etc.

Note that the results are rapidly built up by combinations of terms
from the Poisson distributions which have parameters $\delta t$ and $\epsilon t$.
This will be termed the “burnt fingers” distribution.
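From (4.10) it is found that

$$\mu'_1(x) = \epsilon t + \left(1 - \frac{\epsilon}{\delta}\right)\left(1 - e^{-\delta t}\right) \tag{4.13}$$

$$\mu_2(x) = \mu'_1(x) + 2\epsilon\left(1 - \frac{\epsilon}{\delta}\right)\left[t\,e^{-\delta t} - \frac{1 - e^{-\delta t}}{\delta}\right] - \left(1 - \frac{\epsilon}{\delta}\right)^{2}\left(1 - e^{-\delta t}\right)^{2} \tag{4.14}$$

Compare Greenwood and Yule (44) equations (39) and (40), after
allowing for misprints in their equation (39).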
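A simple way of estimating $\delta$ and $\epsilon$ (cf. Anscombe (82)) is to employ
the obvious estimates for $p(0)$ and $\mu'_1(x)$ and hence obtain estimates $\hat\delta$
and $\hat\epsilon$ from

$$e^{-\hat\delta t} = f_0/n = \text{observed proportion with no accidents} \tag{4.15}$$

$$\frac{\hat\epsilon}{\hat\delta} = \frac{\bar x - (1 - f_0/n)}{\hat\delta t - (1 - f_0/n)} \tag{4.16}$$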

Using standard approximations (Kendall (79)) and writing $q(0) =
1 - p(0)$, it will be found that


= - (4.17) 


(4.16) 


, €q(0) 
c= += 5 (4.18) 


The form in which Greenwood and Yule evolved the “burnt fingers” 
distribution was difficult to handle arithmetically. In addition it did 
not fit their data particularly well. 

We now apply this theory to the data given in Table VI, divided 
into three age groups as shown. $\bar x$ for the first group differs significantly
(at the 5% level) from $\bar x$ for the other two groups so that it seems
pointless to combine all three groups. To each of the four groups 
shown a Poisson distribution and a “burnt fingers” distribution was 
fitted. 

As judged by the chi-square test, the Poisson distribution gives a 
rather unsatisfactory fit for the first three groups taken separately and 
a poor fit for the combined group. 

The “burnt fingers” distribution is practically identical with the
Poisson distribution for the first group (ages 21 to 25), but gives a
very much closer description of the data for the other two groups and
these two groups combined. Since $\delta$ and $\epsilon$ have not been estimated by
“maximum likelihood” we are not, strictly speaking, entitled to use P
as a measure of goodness of fit. We suggest however that the
improvement is so obvious, for two out of the three groups, that the “burnt
fingers” distribution merits attention.

The “burnt fingers” distribution has been dealt with first, because 
it was the first (and so far the only) serious attempt to apply contagious 
distributions to accident statistics. 

Meanwhile, workers in other fields were developing the theory. For 
instance, a generalisation of the “burnt fingers” distribution is obtained 
by accepting equations (4.1), (4.2) and (4.3) and assuming that 

$$f(x, t) = \beta_x \tag{4.19}$$
where $\beta_x$ is positive and in general different for each value of $x$.

The general solution for p(x) in this case has been found inde- 
pendently by several authors (see Feller (84) for details) and it is 
found that: 

the $p(x)$ are the successive terms obtained by expanding $e^{-\beta t}$ in a
Newton divided difference formula with respect to the arguments
$\beta_0, \beta_1, \beta_2$, etc., and then putting $\beta = 0$ (4.20)

If some of the $\beta$’s are equal the appropriate limiting forms of the
divided differences must be taken. This result is very neat in theory, 
but in practice it seems simpler to examine particular cases on their 
own merits as is done here, rather than derive them from the general 
formula. 
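For distinct rates the divided difference rule is straightforward to program. In the Python sketch below, Newton's formula for $e^{-\beta t}$ is evaluated at $\beta = 0$, so that the term for $x$ accidents is the divided difference on $\beta_0, \ldots, \beta_x$ multiplied by $(-\beta_0)(-\beta_1)\cdots(-\beta_{x-1})$. The function names and the five illustrative rates are assumptions, and the code does not handle the limiting case of equal rates.

```python
from math import exp


def divided_differences(xs, f):
    """Newton divided differences f[x0], f[x0,x1], ..., f[x0..xn].

    xs must contain distinct values.
    """
    coeffs = [f(x) for x in xs]
    n = len(xs)
    for j in range(1, n):
        for i in range(n - 1, j - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - j])
    return coeffs


def birth_process_probs(betas, t):
    """p(0), ..., p(len(betas) - 1) at time t for rates f(x, t) = betas[x]."""
    dd = divided_differences(betas, lambda b: exp(-b * t))
    probs = []
    prod = 1.0                      # product of (0 - beta_j) for j < x
    for x, d in enumerate(dd):
        probs.append(prod * d)
        prod *= -betas[x]
    return probs


# Illustrative distinct rates beta_0, ..., beta_4 and time t = 2.
betas = [1.0, 0.6, 0.8, 0.3, 0.5]
probs = birth_process_probs(betas, t=2.0)
print(probs)

# Check of the first two terms against the successive solution of (4.5).
p0 = exp(-betas[0] * 2.0)
p1 = betas[0] * (exp(-betas[1] * 2.0) - exp(-betas[0] * 2.0)) / (betas[0] - betas[1])
print(probs[0] - p0, probs[1] - p1)
```

The first two terms agree with the direct solution of the differential equations. High-order divided differences of nearly equal rates lose accuracy in floating point, which is one practical reason for treating particular cases on their own merits.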

A particular case of (4.19) has been much used by workers in other 
fields. In this case the assumption is made that 


$$f(x, t) = \beta + \gamma x \quad \text{where } \beta \text{ and } \gamma \text{ are positive} \tag{4.21}$$
In this case, (4.6) becomes

$$\frac{\partial}{\partial t} G(x, u) = (u - 1)\left[\beta\, G(x, u) + \gamma u\, \frac{\partial}{\partial u} G(x, u)\right] \tag{4.22}$$
This is a simple partial differential equation with a solution

$$G(x, u) = \left[e^{\gamma t} - u\left(e^{\gamma t} - 1\right)\right]^{-\beta/\gamma} \tag{4.23}$$
which satisfies the required initial conditions $G = 1$ at $t = 0$ and at $u = 1$.
That is to say, $x$ has a negative binomial distribution with

$$p(x, t) = e^{-\beta t}\, \frac{\Gamma(\beta/\gamma + x)}{x!\,\Gamma(\beta/\gamma)} \left(1 - e^{-\gamma t}\right)^{x} \tag{4.24}$$

which is essentially of the same form as $p(x)$ in equation (3.14) with
$k$ replaced by $\beta/\gamma$ and

$$\frac{m}{m + k} \ \text{ replaced by } \left(1 - e^{-\gamma t}\right) \tag{4.25}$$

If the same methods are used for fitting (3.14) and (4.24) to
observed distributions (e.g. by moments as in Section III) identical
numerical results follow.

The ironical and disconcerting fact is thus met with, that two very
different hypotheses lead to the same distribution.
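The identity of the two distributions can be verified numerically. The Python sketch below assumes the negative binomial forms $p(x) = \frac{\Gamma(k+x)}{x!\,\Gamma(k)}(1+m/k)^{-k}\,(m/(m+k))^{x}$ for the compound Poisson hypothesis and $p(x,t) = e^{-\beta t}\frac{\Gamma(\beta/\gamma+x)}{x!\,\Gamma(\beta/\gamma)}(1-e^{-\gamma t})^{x}$ for the contagious hypothesis; the function names and the parameter values are illustrative only.

```python
from math import exp, lgamma, log


def nb_pmf_compound(x, m, k):
    """Negative binomial with mean m and exponent k (compound Poisson form)."""
    return exp(lgamma(k + x) - lgamma(k) - lgamma(x + 1)
               - k * log(1.0 + m / k) + x * log(m / (m + k)))


def nb_pmf_contagious(x, beta, gamma, t):
    """Negative binomial arising from the rate f(x, t) = beta + gamma * x."""
    r = beta / gamma
    return exp(-beta * t + lgamma(r + x) - lgamma(r) - lgamma(x + 1)
               + x * log(1.0 - exp(-gamma * t)))


# Illustrative contagious parameters and the matching compound parameters.
beta, gamma, t = 0.4, 0.15, 3.0
k = beta / gamma
m = k * (exp(gamma * t) - 1.0)      # so that m / (m + k) = 1 - exp(-gamma t)

diffs = [abs(nb_pmf_compound(x, m, k) - nb_pmf_contagious(x, beta, gamma, t))
         for x in range(25)]
print(max(diffs))
```

The largest difference is at the level of rounding error, so no univariate table of accident counts can separate the two hypotheses.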

Given an observed distribution which, as far as we can tell, comes 
from a population with a negative binomial distribution, this fact by 
itself gives no clue whatever as to whether the observations can be 
“explained” by the “compound Poisson” hypothesis, or the
“contagious” hypothesis (or by some further hypothesis as yet undefined).

The physical significance of the compound Poisson hypothesis has 
been discussed in the previous section. 

Consider the physical significance of (4.1) and (4.21). At time t = 0 
no member of the population has had an accident, and each individual 
has the same probability $\beta\,dt$ of having an accident within a small
interval of time dt. But later on the individuals separate out, some 
have had no accidents up to that time, others one or two or more 
accidents. For these classes, the probability of having an accident 
within the next small interval of time dt differs, and the more accidents 
a man has had the more likely he is to have another accident. 

It is easy to criticise this hypothesis, and remark that it is well 
known that initially the probability of having an accident is not the 
same for each individual; that when a man has had an accident, it may 
tend to make him more careful in the future and so on, but the point 
stressed here is that each set of observations should be treated on its 
own merits and not by analogy with what other people have observed 
about other data. In the case discussed here, if the observations are 
arranged as a univariate distribution, it is impossible to decide whether 
the compound Poisson or the contagious hypothesis is the more plausible. 
Can we then, re-arrange the data in such a manner as to throw some 
light on the problem? In the next section this will be done by splitting 
the complete time period up into two subintervals and studying the
resultant bivariate distributions.*


*In equation (4.21) $f(x, t) = \beta + \gamma x$ where $\beta$ and $\gamma$ are positive and each accident that an
individual has makes him more likely to have another accident.
If $f(x, t) = \beta - \gamma x$
then (a) $p(x)$ turns out to be an ordinary binomial distribution and (b) each accident that an individual
has makes him less likely to have another accident. This looks promising for some types of data:
the drawback is that for large enough $x$, $f(x, t)$ becomes negative and cannot represent a probability.
A more realistic development of the “burnt fingers” model would be to define $f(x, t)$ as a monotonic
decreasing function of $x$ that remained > 0. This suggestion will not be followed up here, because the
main aim of this monograph is to discuss standard theory.


SECTION V: BIVARIATE FREQUENCY DISTRIBUTIONS


Successful scientific analysis consists of three stages: (i) the
examination of what has happened, (ii) the prediction of what will happen,
and (iii) the “explanation” of what happens. 

Very often one observes in order to predict, and can predict suc- 
cessfully without worrying over much about explanations. This is the 
practical approach taken by Arbous and Sichel (59) in the field of 
absenteeism. Its application to accident data will be considered with 
particular reference to (A) the bivariate compound Poisson (negative 
binomial), and (B) bivariate contagious distributions respectively. In 
accident statistics the observed fact is that some people tend to have 
more accidents than other people, time and again. A fundamental 
problem is to be able to predict which people will have most accidents 
in the future. The advantages that would follow from being able to 
predict in this manner are manifold and evident. A sensible method of 
attacking the problem would appear to be to divide the period of
observation into two sub periods and compare what happens to an
individual in the first period with what happens in the second period.
Most of what follows deals with the above problem and the proposed
method of attacking it. Neither problem nor method is new.

Once the period of observation has been divided into sub-intervals
and the data arranged as in Table I, one of the first steps taken will
be to calculate the sample coefficient of correlation $r_{01}$, which is an
estimate of the population coefficient of correlation $\rho_{01}$, associated
with the bivariate population distribution $p(x_0, x_1)$ of equation (1.6).
As mentioned before, if, as far as we can tell, the distributions for the
two subintervals and the complete period of observation are Poisson,
and $\rho_{01}$ is zero, we may be willing to accept the hypothesis that the
population is nearly homogeneous. What is more to the point in
practice, if $r_{01}$ differs significantly from zero so that we are reasonably sure
that $\rho_{01} > 0$, this affords convincing proof of non-homogeneity.

For people working in the same environment the argument seems 
simple: for various reasons, some individuals have few accidents in both 
time periods, while others have many: the greater this observed
difference between individuals, the larger $r_{01}$ will be. If $r_{01}$ is large enough
$\rho_{01}$ is almost certainly greater than zero, and the observed differences
in individual behaviour cannot be explained by “chance”.

We thus cut right to the heart of the practical problem, and by 
simple means establish, beyond reasonable doubt, that some individuals 
persistently have more accidents than others. Thus, the population is 
non-homogeneous. Why is this and what are we going to do about it, 
are questions which have to be treated with due caution.

The point stressed is that this technique of dividing the whole
period of observation into two or more sub-periods provides a simple 
and powerful method of analysing the observations and should be used 
whenever possible. 


(A): A Bivariate Compound Poisson Distribution. The idea has
been stressed that valuable information can be obtained by dividing
the period of observation 0 to T into two or more sub-intervals. It
will now be applied to the data of Table I, (a) by fitting a simple
bivariate compound Poisson distribution to them (in this sub-section),
and (b) by fitting a particular bivariate contagious distribution to them
(in the next sub-section).

Take two subintervals, $\delta_0$ (the interval 0 to $t$) and $\delta_1$ (the interval
$t$ to $T$), and let $\delta$ denote the complete interval 0 to $T$ (5.1)

As in Section I, let


$x_0$ = number of accidents an individual has during the interval $\delta_0$
$x_1$ = number of accidents an individual has during the interval $\delta_1$
$x$ = number of accidents an individual has during the interval $\delta$


and assume, for each individual, that

$$p(x_0 \mid \lambda) = \frac{e^{-\lambda\delta_0}(\lambda\delta_0)^{x_0}}{x_0!} \tag{5.2}$$

$$p(x_1 \mid \lambda) = \frac{e^{-\lambda\delta_1}(\lambda\delta_1)^{x_1}}{x_1!} \tag{5.3}$$
These assumptions are less general than the one made in Section I.
They contain the assumption that in equation (1.9) $f(t) = \lambda$ where
$\lambda$ is a fixed annual accident rate unaffected by time. Contrast this with
the $\lambda_t$ defined in (1.18) which depends on time, and the $\lambda$ of (3.18) which
refers to a period of eleven years.
The simplified assumption is made because, as shown later, it appears
reasonable in the case to which the theory is applied.
Next, make Greenwood’s assumption that the individual $\lambda$’s have
a type III distribution and express this in the form

$$p(\lambda) = \frac{(k/a)^{k}}{\Gamma(k)}\, \lambda^{k-1} e^{-k\lambda/a} \tag{5.4}$$

(Compare with (3.9); $a$ = fixed annual average number of accidents per
individual and is less general than $m$.)
Note that

$$p(x_0, x_1 \mid \lambda) = p(x_0 \mid \lambda)\, p(x_1 \mid \lambda) \quad \text{from (1.6)} \tag{5.5}$$
and that

$$p(x_0, x_1, \lambda) = p(x_0, x_1 \mid \lambda)\, p(\lambda) \tag{5.6}$$
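Then the m.g.f. of this joint distribution is

$$\int_0^\infty \sum_{x_0} \sum_{x_1} u_0^{x_0}\, u_1^{x_1}\, e^{\alpha\lambda}\, p(x_0, x_1, \lambda)\, d\lambda$$

which readily reduces to

$$\left[1 + \frac{a\delta}{k} - \frac{a}{k}\left(\delta_0 u_0 + \delta_1 u_1 + \alpha\right)\right]^{-k} \tag{5.7}$$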


Associated with the variables $x_0$, $x_1$ and $\lambda$ and the variable $x =
x_0 + x_1$ there exist various joint distributions and univariate distribu-
tions which can be written down practically at sight from (5.7) and 
several contingent distributions which are easily obtained by dividing 
one distribution by another (see Appendix for fundamental theorems 
involved). 

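Thus,

$$G(x, u) = \left(1 + \frac{a\delta}{k} - \frac{a\delta u}{k}\right)^{-k} \tag{5.8}$$

and similar results hold for $x_0$ and $x_1$ with $\delta_0$ and $\delta_1$ respectively replacing
$\delta$ (compare equation (3.13)).
Similarly

$$G(x_0, x_1; u_0, u_1) = \left(1 + \frac{a\delta}{k} - \frac{a\delta_0 u_0}{k} - \frac{a\delta_1 u_1}{k}\right)^{-k} \tag{5.9}$$

Thus $x$, $x_0$ and $x_1$ all have negative binomial distributions, where

$$p(x) = \frac{\Gamma(k + x)}{x!\,\Gamma(k)} \left(1 + \frac{a\delta}{k}\right)^{-k} \left(\frac{a\delta}{a\delta + k}\right)^{x} \tag{5.10}$$

(which is merely (3.14) with $a\delta$ replacing $m$) and two similar results hold
for $x_0$ and $x_1$, which are unobtainable by the methods of Section III.

From (5.9) a bivariate compound Poisson distribution is obtained
with

$$p(x_0, x_1) = \frac{k^{k}\, a^{x}\, \Gamma(k + x)\, \delta_0^{x_0}\, \delta_1^{x_1}}{(k + a\delta)^{k + x}\, \Gamma(k)\, x_0!\, x_1!}, \quad x = x_0 + x_1 \tag{5.11}$$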


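Among the contingent probabilities perhaps the most interesting are

$$p(x_1 \mid x_0) = \frac{(k + a\delta_0)^{k + x_0}\, \Gamma(k + x)}{(k + a\delta)^{k + x}\, \Gamma(k + x_0)}\, \frac{(a\delta_1)^{x_1}}{x_1!} \tag{5.12}$$

$$G(x_1 \mid x_0; u) = \left[1 + \frac{a\delta_1}{k + a\delta_0} - \frac{a\delta_1 u}{k + a\delta_0}\right]^{-(k + x_0)} \tag{5.13}$$

$$p(\lambda \mid x) = \frac{(k/a + \delta)^{k + x}}{\Gamma(k + x)}\, \lambda^{k + x - 1}\, e^{-(k/a + \delta)\lambda} \tag{5.14}$$
together with two similar results for $x_0$ and $x_1$.


By expanding (5.7) and the m.g.f. obtained by replacing $u$ by $e^{\alpha}$ in
(5.13) various moments can readily be obtained.


For example

$$\begin{aligned}
\mu'_1(x) &= a\delta, & \mu'_1(\lambda) &= a, & \mu'_1(x_1 \mid x_0) &= \frac{a\delta_1 (k + x_0)}{k + a\delta_0} \\
\mu_2(x) &= a\delta\left(1 + \frac{a\delta}{k}\right), & \mu_2(\lambda) &= \frac{a^{2}}{k}, & \mu_2(x_1 \mid x_0) &= \frac{a\delta_1 (k + x_0)(k + a\delta)}{(k + a\delta_0)^{2}} \\
\operatorname{cov}(x, \lambda) &= \frac{a^{2}\delta}{k}, & \operatorname{cov}(x_0, x_1) &= \frac{a^{2}\delta_0\delta_1}{k}
\end{aligned} \tag{5.15}$$

so that

$$\rho(x, \lambda) = \frac{1}{\sqrt{1 + \dfrac{k}{a\delta}}} \tag{5.16}$$
(Newbold) (46)

and two similar results, and

$$\rho(x_0, x_1) = \frac{1}{\sqrt{\left(1 + \dfrac{k}{a\delta_0}\right)\left(1 + \dfrac{k}{a\delta_1}\right)}} \tag{5.17}$$
(Lundberg) (78)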
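The correlation formulae can be checked against the bivariate distribution itself. The Python sketch below assumes the bivariate compound Poisson form $p(x_0, x_1) = k^{k} a^{x}\,\Gamma(k+x)\,\delta_0^{x_0}\delta_1^{x_1} / [(k+a\delta)^{k+x}\,\Gamma(k)\,x_0!\,x_1!]$ with $x = x_0 + x_1$, and the correlations $\rho(x,\lambda) = (1 + k/a\delta)^{-1/2}$ and $\rho(x_0,x_1) = [(1 + k/a\delta_0)(1 + k/a\delta_1)]^{-1/2}$. The parameter values are the estimates and sub-periods quoted in the text for the shunters (a = .2042, k = 3.524, periods of 6 and 5 years); the function names and the truncation point are hypothetical.

```python
from math import exp, lgamma, log, sqrt


def bivariate_nb_pmf(x0, x1, a, k, d0, d1):
    """Bivariate compound Poisson (negative binomial) probability."""
    x = x0 + x1
    d = d0 + d1
    return exp(k * log(k) + x * log(a) + lgamma(k + x)
               + x0 * log(d0) + x1 * log(d1)
               - (k + x) * log(k + a * d) - lgamma(k)
               - lgamma(x0 + 1) - lgamma(x1 + 1))


def rho_x_lambda(a, k, d):
    """Correlation between accidents in a period of length d and liability."""
    return 1.0 / sqrt(1.0 + k / (a * d))


def rho_x0_x1(a, k, d0, d1):
    """Correlation between accidents in two non-overlapping periods."""
    return 1.0 / sqrt((1.0 + k / (a * d0)) * (1.0 + k / (a * d1)))


def correlation_from_pmf(a, k, d0, d1, x_max=60):
    """Total probability and correlation by direct summation over the table."""
    total = e0 = e1 = e00 = e11 = e01 = 0.0
    for x0 in range(x_max):
        for x1 in range(x_max):
            p = bivariate_nb_pmf(x0, x1, a, k, d0, d1)
            total += p
            e0 += x0 * p
            e1 += x1 * p
            e00 += x0 * x0 * p
            e11 += x1 * x1 * p
            e01 += x0 * x1 * p
    cov = e01 - e0 * e1
    return total, cov / sqrt((e00 - e0 ** 2) * (e11 - e1 ** 2))


a, k, d0, d1 = 0.2042, 3.524, 6.0, 5.0
total, rho_numeric = correlation_from_pmf(a, k, d0, d1)
print(total, rho_numeric, rho_x0_x1(a, k, d0, d1))
print(rho_x_lambda(a, k, d0 + d1))
```

The summed probabilities should total one and the numerical correlation should match the closed form. Expected frequencies for a table such as Table II follow by multiplying each probability by the number of individuals.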


Some of these results will now be applied to the data given in Table I
(Adelstein (53)). The data refer to accidents met with when at work
by 122 shunters. The men had had up to twenty-five years experience 
before coming under observation and were observed for eleven years. 
This period has been divided into two sub periods of 0 to 6 years and 
6 to 11 years respectively. 

A Poisson distribution can be fitted quite plausibly to the observa- 
tions within each subinterval, but not to the observations over the 
whole period.* Further, $r_{01} = .258$ and differs significantly from zero.
It would appear then that the population is non-homogeneous, and we 
proceed to fit a bivariate negative binomial distribution based on the 
compound Poisson hypothesis to the data, as given by (5.11). 

Estimates of $k$ and $a$ were obtained by the method of moments
from the observed distribution of $x$.

They are

$$\hat a = \bar x / \delta \tag{5.18}$$

$$\hat k = \frac{\bar x^{2}}{s^{2} - \bar x} \tag{5.19}$$

where $\bar x = \sum x_i / n$ and $s^{2} = \sum (x_i - \bar x)^{2} / n$ (compare (3.16)).
Their standard errors follow from (3.17).
The numerical results are

$$\hat a = .2042 \pm .0158 \tag{5.20}$$

$$\hat k = 3.5240 \pm 1.3134 \tag{5.21}$$

The resultant bivariate distribution $np(x_0, x_1)$ is given in Table II,
together with $np(x_0)$, $np(x_1)$ and $np(x)$†. The “fit” seems quite
satisfactory, as judged by chi-square tests given in Table IV.

Note also that two further estimates of $a$ are

.212 ± .019 and ± .020


*P = .15, .30 and <.001 respectively.
†np(x) is, of course, the distribution obtained earlier in Section III.


In fact, there is apparently nothing in these observations which 
seriously contradicts the hypothesis that we are dealing with a com- 
pound Poisson process in which p(A) is given by (5.4) with a = .20 
(approx) and k = 3.5 (approx). a and k are constants independent of 
time. a has the simple physical significance: average annual accident 
rate. 

All this seems reasonable enough for these particular observations. 
Adelstein knows of no alteration in the environment which would have 
made a alter in time. The hypotheses employed specifically reject the 
possibility that having an accident alters the probability of an indi- 
vidual having another accident, which seems plausible here since we 
are dealing with experienced men who are presumably “set in their 
ways”. 

It seems almost a pity to have to recall the fact that an entirely 
different hypothesis leads to exactly the same type of distribution for 
p(x) and exactly the same numerical values when applied by the same 
methods to the same data. 

However, it is this fact that led to the search for extra information 
(as given in a bivariate table for example, in contrast to the standard uni- 
variate table) to enable one to judge between the two hypotheses. The 
extra information, as we have just seen appears to support the compound 
Poisson hypothesis very satisfactorily. Of course, it may make the 
“contagious” hypothesis appear still more plausible; this point will be 
discussed in the next sub-section. Leaving that question aside, the 
rest of this sub-section will be given over to a brief discussion of the 
type of information obtainable from the bivariate negative binomial 
distribution of the compound Poisson type, given by (5.11) and the 
distributions associated with it. 

As already emphasized in Section III, the distribution of individual 
“liabilities” can be examined, by means of the p(A) of equation (5.4) 
which shows that 


$$\nu = \frac{2k}{a}\,\lambda \ \text{ has a chi-square distribution with } 2k \text{ d.f.}$$
which for $k = 3.5$ and $a = .20$ becomes

$$\nu = 35\lambda \ \text{ has a chi-square distribution with 7 d.f.} \tag{5.22}$$


It may be remarked that it took a long time (eleven years) to prove 
that these data were non-homogeneous; one feels inclined to term them 
‘nearly homogeneous’; yet if the compound Poisson hypothesis holds
“wide” differences exist between the individual liabilities. The moral
is, do not use vague subjective phrases like “nearly” and “wide” when
one can measure accurately as in this case (by use of (5.22)).

(5.22) deals with the distribution of $\lambda$’s in general, but (5.14) looks
even more interesting. It gives $p(\lambda \mid x)$, i.e. the distribution of liabilities
among those who had exactly $x$ accidents in the period 0 to $T$. That
is, it enables one to estimate a man’s “liability” to have accidents from
observing the number of accidents he sustains in a given time interval.
This is what various writers have tried, time and again, to do: just how
accurately can it be done?

From (5.14) 


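$$\nu = 2\left(\frac{k}{a} + \delta\right)\lambda \ \text{ has a chi-square distribution with } 2(k + x) \text{ degrees of freedom}$$
so for $k = 3.5$, $a = .20$, $\delta = 11$

$$\nu = 57\lambda \ \text{ has a chi-square distribution with } 7 + 2x \text{ d.f.} \tag{5.23}$$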


From this, results such as are shown in Table V follow: 


TABLE V
90% confidence limits for λ for known x

  x    limits for λ        x    limits for λ
  0    .038 - .247         5    .092 - .369
  1    .048 - .272         6    .103 - .392
  2    .058 - .297         7         - .416
  3    .069 - .321         8    .127 - .439
  4    .080 - .345         9    .142 - .461


In the light of the above figures, it would be dangerous to assert, 
for example, that if within the eleven year period A has 2 accidents
and B has 7 accidents, then this proves that B is intrinsically “more
liable” to have accidents than A is.

This hammers again at the point raised in Section II, that pure 
chance plays such a large part in accident statistics that it is dangerous 
to state dogmatically that “B is accident prone, while A is not”.

Admittedly, the above methods applied to other data may on oc- 
casion give much more clear cut results than are obtained for these 
shunters. The method is powerful and it measures accurately such 
information as can be obtained. 

From p(λ | x) we naturally consider p(λ, x). This is simply
p(λ | x)p(x). Note that


BIOMETRICS, DECEMBER 1951 


\[ \rho(\lambda, x) = \frac{1}{\sqrt{1 + k/(a\theta)}} \]

was obtained by Newbold and applied to various data. To this fact
we offer the following comment: Just what is the use of ρ(λ, x), when
one has found it? For the shunters, ρ(λ, x) = .62. Obviously, quite a
“high tendency to concomitant variation between x and λ”. But if
one wishes to get any clear cut information, such as “among those for
which x > 5 what proportion have λ > .4?” then one must in general
turn to the complete bivariate distribution p(λ, x). Nothing less will
do. ρ(x, λ) is only one constant descriptive of certain aspects of a
complicated situation. More descriptive constants are required before
the situation can be adequately described. If we use regression theory,
which employs ρ(λ, x) and other constants, then just what sort of
regression theory do we employ? Clearly not normal regression theory,
for x is discrete. If we are going to build the correct regression theory
we shall need p(λ, x); if we are going to ignore regression theory we still
need p(λ, x) if we are to be able to study all aspects of the co-variation
of x and λ.
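
The figure of .62 can be recovered from second moments alone. The sketch below assumes, as (5.23) implies, that the liability λ follows a gamma law with index k and mean a per year, and that x is Poisson with mean λθ for given λ. The function name is ours.

```python
import math


def liability_accident_correlation(k, a, theta):
    """Correlation between liability lambda and accident count x.

    Assumes lambda ~ gamma with index k and mean a, and x | lambda ~ Poisson(lambda * theta).
    """
    var_lambda = a * a / k                           # variance of the gamma law
    cov = theta * var_lambda                         # Cov(x, lambda) = theta * Var(lambda)
    var_x = a * theta + theta * theta * var_lambda   # Poisson part plus mixing part
    return cov / math.sqrt(var_lambda * var_x)


rho = liability_accident_correlation(k=3.5, a=0.20, theta=11.0)
print(round(rho, 2))
```

Under these assumptions the ratio simplifies to 1/√(1 + k/(aθ)), which shows that the coefficient depends on the length of the observation period as much as on the spread of liabilities.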

Very similar remarks apply to ρ(x₀, x₁) and p(x₀, x₁). ρ(x₀, x₁) = .26
for whatever it is worth, but surely the bivariate distribution tabulated
in Table II is more useful, and not very difficult to compute. From it
we see, for example, that of those who have 5 or more accidents in the
first period we expect that 5% will have 5 or more accidents during
the second period.

A succession of probabilities of this type can be computed if re- 
quired; the whole process can be repeated for different time intervals. 
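
A probability of this type can be computed from the bivariate distribution as follows. The sketch is a hedged illustration, not the computation used for Table II: it assumes a gamma law of liability with index k = 3.5 and mean a = .20 per year (the rounded constants quoted earlier), independent Poisson counts in the two periods for given liability, and period lengths θ₀ = 6 and θ₁ = 5 years. The function names are ours. Because the constants are rounded, the result is close to, but not identical with, the 5% read from Table II.

```python
import math


def joint_pmf(x0, x1, k=3.5, a=0.20, theta0=6.0, theta1=5.0):
    """p(x0, x1) for a gamma mixture of independent Poisson counts (assumed model)."""
    c = k / a  # rate constant of the assumed gamma law
    log_p = (math.lgamma(k + x0 + x1) - math.lgamma(k)
             - math.lgamma(x0 + 1) - math.lgamma(x1 + 1)
             + k * math.log(c)
             + x0 * math.log(theta0) + x1 * math.log(theta1)
             - (k + x0 + x1) * math.log(c + theta0 + theta1))
    return math.exp(log_p)


def prob_repeat(threshold=5, upper=80):
    """P(x1 >= threshold | x0 >= threshold), summing the table up to `upper`."""
    both = sum(joint_pmf(i, j)
               for i in range(threshold, upper)
               for j in range(threshold, upper))
    first = sum(joint_pmf(i, j)
                for i in range(threshold, upper)
                for j in range(upper))
    return both / first


if __name__ == "__main__":
    print(round(prob_repeat(5), 3))
```

Changing `threshold`, or the two period lengths, gives the succession of probabilities referred to above.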

To sum up. The data are arranged in a bivariate table as in Table I.
The hypothesis is then tested that we deal with a compound Poisson
situation as defined in equations (5.2), (5.3) and (5.4). If we decide to
accept this hypothesis then not only do several distributions such as
p(x₀, x₁) exist describing what has happened, but analogous distributions
such as p(x, x₂) can be obtained from which predictions can be
made as to what will happen in some future time interval. In our
opinion these various bivariate and contingent distributions which have
been developed by Arbous and Sichel (59) in the related field of
absenteeism have not been fully utilised in the study of accidents.


(B): A Bivariate Contagious Distribution. In this sub-section the
“contagious” hypothesis which led to the negative binomial distribution
given in equation (4.24) is analysed further. When (4.24) is applied
to the data for 122 shunters for 11 years the identical np(x) is obtained
as followed from the compound Poisson hypothesis (as shown in Tables
II and III) and we have to choose between these hypotheses.

When applied to these shunters of long experience this contagious 
hypothesis asserts (a) all shunters had no accidents by the beginning 
of the eleven year period (b) the more accidents they had the more 
likely they were to have more accidents. 

The reader may well exclaim that (a) and (b) are such obvious
nonsense when applied to shunters that he rejects the contagious
hypothesis out of hand.

While we sympathise, we make the point that logically one is only
entitled to reject hypotheses made concerning these shunters in the light
of the information one possesses about these particular men. If one
quotes what other people have said about other shunters, then one
argues by analogy, which sometimes leads one badly astray.

So, at the risk of being pedantic, we point out to the reader the 
extra information about these men obtained by splitting the eleven year 
period into two shorter periods, as given in Table I, and proceed to 
develop the corresponding “contagious” distributions and to test their 

With x₀, x₁, x = x₀ + x₁, θ₀, θ₁, and θ = θ₀ + θ₁ defined as in
Section V and Section I, then from (4.1), (4.2), (4.3) and (4.24) it
follows that


p(x) is given by (4.24) with θ replacing t (5.24)


p(x₀) is given by (4.24) with θ₀ replacing t and x₀ replacing x (5.25)
p(x₁ | x₀) is obtained from (4.5) with


f(t, t) = B+ + (5.26) 
so that (4.6) becomes 
| to, u) = — ue™ — (5.27) 
whence 


whence 
G(x₀, x₁; u₀, u₁) = … (5.30)
In this, putting u₀ = u₁ = u gives G(x, u) and hence p(x) as in (5.24);
u₁ = 1 gives G(x₀, u₀) and hence p(x₀) as in (5.25);
u₀ = 1 gives G(x₁, u₁) and hence p(x₁) if required.


Replacing u by e^u transforms a p.g.f. into the corresponding m.g.f.,
whence from (5.27) and (5.30) it follows for example that


= (e7* — 1) and a similar formula for 2, 


= B — and a similar formula for 


| = (84 — 


wiles | 20) = (2 + De" (5.31) 
, 21) = 


We illustrate these results, as before, by applying them to the data 
given in Table I. Estimating the parameters by the method of mo- 
ments just as in chapter 3 we obtain— 


= (5.33) 
_ 
B= (5.34) 


whence 7 and ® are readily obtained. The corresponding bivariate 
negative binomial distribution is given in Table III, where the reader 
can compare it with the original data given in Table I. 

In Table IV np(x) and the marginal distributions np(x₀) and np(x₁)
are compared with the corresponding observed distributions; np(x) is, of
course, identical with the np(x) obtained by the method of moments
from the compound Poisson hypothesis, and there is nothing in these
figures to guide our choice between the “compound” and the “contagious”
hypotheses. The marginal distributions np(x₀) and np(x₁) fit
the data fairly well, as judged by the chi-square test, but the np(x₀)
and np(x₁) obtained from the “compound” hypothesis fit the data rather
better. Here at last is evidence, from the given observations, that the
“compound” hypothesis is more plausible than the “contagious” hypothesis.
As pointed out earlier, the “contagious” hypothesis can be
criticised on various grounds if applied to these data (for instance,
equation (4.1) does not hold), but the point is made again that all such
criticism should be backed by carefully analysed observations.
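
The chi-square contributions on which such a comparison rests are simple to recompute. The sketch below does so for the distribution of x over the whole eleven years, using the fitted frequencies np(x) for x = 0 to 5 as printed in Tables II and IV. The observed frequencies are those of Table IV, with the value 7 for x = 4 taken from Table I; that reading is an assumption on our part. Classes above x = 5 are pooled in Table IV and are left out here.

```python
def chi_square_terms(observed, expected):
    """Contribution (f_o - f_e)^2 / f_e of each class to chi-square."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Observed and fitted frequencies for x = 0, 1, ..., 5 (122 shunters, 11 years).
observed = [21, 31, 26, 19, 7, 9]
expected = [21.47, 29.44, 25.93, 18.57, 11.76, 6.89]

terms = chi_square_terms(observed, expected)
for x, t in enumerate(terms):
    print(x, round(t, 2))
```

Most of the total comes from the single class x = 4, which is worth bearing in mind before reading much into the overall value.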

Other points worth noting are: (i) We obtained the extra evidence
we needed by dividing the whole period of observation into two
sub-periods. (ii) In spite of all the arguments advanced against the
application of this particular contagious distribution to these data, the
fact remains that the formulae “fitted” the data quite well. We hope that
the reader will remember this as a salutary warning whenever he feels
inclined to dogmatise about the “explanations” of his own observations.
(iii) The properties of this bivariate “contagious” distribution and its
associated marginal and conditional distributions can be rapidly
developed from G(x₀, x₁; u₀, u₁).

This completes our discussion of the methods contained in standard 
literature. It is hoped—(a) that a reasonably coherent account of what 
has been done is given, and (b) that the tools have been furbished up as 
it were, and ways and means of making fresh applications of them 
suggested. 


SECTION VI: CONCLUSIONS 


In Part II of the monograph we have—(a) Collected and examined 
the standard mathematical theory used in analysing accident statistics 
and developed it in an orderly manner. (b) Shown how it is connected 
with the general theory of stochastic processes. (c) Applied the theory 
to a few practical examples. 

In a sense this theory will always be useful. People will on occasion 
collect accident statistics over a period of time and need to describe 
them, by means of these or similar theories. 

In a sense this theory misses the mark altogether. It deals with 
people who have been under observation for the whole of a period of 
time and says nothing about the misfits who try a job and soon leave 
it. Now one of the fundamental problems that faces the industrial 
psychologist is how to spot the misfit, preferably before he begins a job, 
and the above theory will do little or nothing to help him here. 

Considerable pains have been taken to discuss the physical sig- 
nificance of the mathematical formulae developed, for the benefit of 
the non-mathematician, and great stress has been laid on the tentative
nature of the conclusions arrived at in practical cases. This is of course
flogging a dead horse. All of the writers quoted advanced their hypotheses
tentatively and hedged them about with warnings and advice.
This has not prevented other writers from beginning an article with:
“as so and so has shown . . .” and on occasion going on blithely to absurd
conclusions. A more fundamental approach has been suggested by
Arbous and Sichel (59). Here the mathematical tools for this approach
have been sharpened, and are now available for future investigators who
wish to apply them.
SECTION VII: LIST OF TABLES 
TABLE I. 


ACCIDENTS AMONG 122 EXPERIENCED SHUNTERS 
1937-42 (6 years) = 8 1937-47 (11 years) = 6 
Zo 
12232486 6 z 
= 0/21 188 8 2 1——| 50 0 21 
14 10 1 4 1 31 
? 1177 2 26 
3) 2 1 141 9 3 19 
2 4 7 
0 5 9 
1 7 1 
Ss 40 39 26 8 6 2 1 122 8 3 
122 =n 
TABLE II.


COMPOUND POISSON HYPOTHESIS
Bivariate negative binomial distribution np(x₀, x₁) fitted to data of Table I, as shown in
Section V(A).

x₁ \ x₀      0      1      2      3      4      5      6     7+   np(x₁)
0         21.47  16.06   7.71   3.01   1.04    .33    .10    .05    49.77
1         13.38  12.86   7.54   3.48   1.39    .50    .17    .11    39.43
2          5.36   6.28   4.35   2.31   1.04    .42    .16    .11    20.03
3          1.74   2.40   1.93   1.16    .58    .26    .11    .06     8.24
4           .49    .80    .72    .49    .27    .13    .06    .03     2.99
5           .13    .24     …      …     .12    .06    .03    .01     1.01
6           .03    .07    .08    .06    .04    .02    .01    .00      .31
7+           …      …     .06    .05    .03    .01    .00    .00      .22
np(x₀)    42.62  38.76  22.63  10.74   4.51   1.73    .64    .37   122.00

x          0      1      2      3      4      5      6      7      8     9+
np(x)   21.47  29.44  25.93  18.57  11.76   6.89   3.79   2.01   1.01   1.13
TABLE III.
“CONTAGIOUS” HYPOTHESIS


Bivariate negative binomial distribution np(x₀, x₁) fitted to data of Table I as shown in


3 4 5 6 7+ c np(z) 

0 | 21.47 14.26 6.08 2.07 .64 .19 .05 .04 44.80 0 21.47 
1} 15.18 12.95 6.75 2.71 .97 .32 .09 .06 39.03 1 29.44 
2] 6.90 7.19 4.42 2.04 .82 .30 .10 .08 21.75 2 25.93 
23.86 3.15 2.28 1.17. .21 .04 9.96 3 18.57 
4 .84 1.20- .95 .56 .25 .12 .05 .02 3.99 4 11.76 
5 .26 .41 .37 .24 .18 .06 .03 .01 1.51 5 6.89 
6 .03 .00 51 6 3.79 
7 05 .08 .09 .06 .03 .02 .01 .01 35 7 2.01 
47.33 39.37 21.02 8.94 3.42 1.25 .41 .26 122.00 8 1.01 
9+ 1.13 


TABLE IV. 
COMPARISON OF “GOODNESS OF FIT” OF THE DISTRIBUTIONS GIVEN ABOVE 


Neg. B Compound Contagious Compound Contagious 
z fo x? | fo f, x? | fo x? x? 
0 21 | 21.47 01; 40 | 42.62 16) 47.33 |1.15) 50 | 49.77 44.80 60) 
1 31 | 29.44 08} 39 | 38.76 — | 39.37 — | 43 | 39.43 33} 39.03 41 
2 26 | 25.93 — | 26 | 22.63 .51] 21.02 {1.19} 17 | 20.03 45) 21.75 |1.05 
3 19 | 18.57 01} 8} 10.74 -70| 8.94 10; 9 8.24 07) 9.96 10 
4 11.76 /|1.93) 6 4.51 3.42 2 2.99 3.99 
5 9 6.89 65} 2 1.73 1.25{ |1.47] 1.01 52} 1.51{ {1.78 
6 5 3.79 1 64 .41 0 .31 51 
7 1 2.01 14 38 -29 1 .22 .35 
8 3 1.01 
1.13
TABLE VI.
DISTRIBUTION OF ACCIDENTS AMONG 552 SHUNTERS DURING THEIR FIRST 
YEAR OF EMPLOYMENT 


Age group 21-25 Age group 26-30 Age group 31-35 Combined 

Age group 25-35 
No. of “Burnt “Burnt “Burnt “Burnt 
Accidents |Ob.*/Poisson} Fin- {Ob.*/Poisson| Fin- |Ob.*|Poisson| Fin- |Ob.*/Poisson| Fin- 
gers”’ gers” gers” gers” 

0 80 | 80.1 | 80.0 |121 {126.9 {121.0 | 80 | 86.7 | 80.0. |201 |213.7 |201.0 

1 56 | 60.3 60.3 85 | 73.8 83.5 61 | 50.4 61.0 |146 |124.2 |145.4 

2 40 | 22.7 22.7 19 | 21.4 19.1 13 | 14.6 12.3 32 | 36.0 30.8 

3 4 5.7 5.7 1 4.2 3.0 1 2.8 1.6 2 7.0 4.3 

1.1 1.1 0 4 5 0 1.0 .5 

5 | 0 | 0 oll 

6 1 1 


= .7529 § = 754) = .5815 § = .629) = .5806 § = .661) = .5812 § = .642 
S? = 6801 @ = .751|S? = .5694 = .444/S? = 4500 @ = .361/S? = .5209 = .410 


x? 3.88 3.95 3.96 0.61 3.60 0 8.23 osa 


*Observed. 


SECTION VIII APPENDIX 


For the convenience of the reader some of the more important 
formulae used in the text are collected here. 
(a) Basic formulae in the theory of probability. 
If E₁ and E₂ denote two “events” then
E₁ + E₂ denotes the joint event “either E₁ or E₂”
E₁E₂ denotes the compound event “E₁ as well as E₂”
E₂ | E₁ denotes the contingent event “E₂, given that E₁ happens”
P(E) or p(E) denotes the numerical probability that event E happens.
Then, if E₁ and E₂ are mutually exclusive
\[ P(E_1 + E_2) = P(E_1) + P(E_2) \quad \text{(The Addition Principle)} \tag{1} \]
while, in all cases
\[ P(E_1 E_2) = P(E_1)P(E_2 \mid E_1) = P(E_2)P(E_1 \mid E_2) \quad \text{(The Multiplication Principle)} \tag{2} \]
Events are defined as stochastically independent if
\[ P(E_2 \mid E_1) = P(E_2) \]
so that
\[ P(E_1 E_2) = P(E_1)P(E_2) \tag{3} \]
If x is a discrete statistical variable that can take on the values
x₀, x₁, x₂, ... with respective probabilities
\[ p(x_0),\ p(x_1),\ p(x_2),\ \ldots \tag{4} \]
then p(x) is termed the law of distribution of x.
(b) Moments and Moment generating functions. 
In what follows Σ denotes summation over all the values that the
discrete variable affected by the summation can take on. In practice
these values are often the successive integers 0, 1, 2, ....


The distribution p(x) has various “moments” associated with it
such as

\[ \mu'_k = \sum x_i^k\, p(x_i) = k\text{-th moment about the origin.} \tag{5} \]
μ′₁ is the expected value of x, or its true mean.
\[ \mu_k = \sum (x_i - \mu'_1)^k\, p(x_i) = k\text{-th moment about the mean.} \tag{6} \]
μ₂ is the true variance of x.
\[ \mu_{[k]} = \sum x_i (x_i - 1) \cdots (x_i - k + 1)\, p(x_i) = k\text{-th factorial moment.} \tag{7} \]

The moments are useful descriptive constants associated with the 
distribution. By means of equations (5), (6) and (7) one type of mo- 
ment can be expressed in terms of moments of another type. 

For example 


Uy (8) 
and 
Me = + — (9) 
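
Relations of this kind are easily verified on a small example. The sketch below computes the three types of moment from their definitions and checks one standard relation, namely that the variance equals the second factorial moment plus the mean minus the square of the mean. The distribution used is hypothetical and chosen only for convenience, the function names are ours, and exact fractions are used so that the check involves no rounding.

```python
from fractions import Fraction


def raw_moment(pmf, k):
    """k-th moment about the origin."""
    return sum(Fraction(x) ** k * p for x, p in pmf.items())


def central_moment(pmf, k):
    """k-th moment about the mean."""
    mean = raw_moment(pmf, 1)
    return sum((x - mean) ** k * p for x, p in pmf.items())


def factorial_moment(pmf, k):
    """k-th factorial moment: expected value of x(x - 1)...(x - k + 1)."""
    total = Fraction(0)
    for x, p in pmf.items():
        product = Fraction(1)
        for j in range(k):
            product *= (x - j)
        total += product * p
    return total


# A hypothetical distribution on 0, 1, 2, 3 (probabilities sum to one).
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

mean = raw_moment(pmf, 1)
variance = central_moment(pmf, 2)
print(mean, variance, factorial_moment(pmf, 2))
```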


Associated with a distribution p(x), we have its probability generating
function (p.g.f.) defined as


\[ G(x, u) = \sum u^{x_i} p(x_i) = u^{x_0} p(x_0) + u^{x_1} p(x_1) + \cdots \tag{10} \]


which “generates” probabilities in the sense that the coefficient of u^{x_i}
is p(x_i).

Replacing u by e^u in the above we have the moment generating
function (m.g.f.) defined as


\[ M(x, u) = \sum e^{u x_i} p(x_i) \tag{11} \]


in which the coefficient of u^k/k! is μ′_k, the k-th moment about the origin.
Replacing e^u by (1 + u) in the above we have the factorial moment
generating function (f.m.g.f.), defined as


\[ M[x, u] = \sum (1 + u)^{x_i} p(x_i) = 1 + u \sum x_i\, p(x_i) + \frac{u^2}{2!} \sum x_i (x_i - 1)\, p(x_i) + \cdots \tag{12} \]


in which the coefficient of u^k/k! is μ_[k], the k-th factorial moment.
If z = f(x) is a given transformation, and we require the law of
distribution of z, this has generating functions


\[ G(z, u) = \sum u^{f(x_i)} p(x_i), \quad M(z, u) = \sum e^{u f(x_i)} p(x_i), \quad M[z, u] = \sum (1 + u)^{f(x_i)} p(x_i) \tag{13} \]


If x and y are two discrete statistical variables with a joint law of
distribution p(x, y) then this distribution has generating functions
defined by


\[ G(x, y, u, v) = \sum\sum u^{x_i} v^{y_j}\, p(x_i, y_j) \tag{14} \]
\[ M(x, y, u, v) = \sum\sum e^{u x_i + v y_j}\, p(x_i, y_j) \tag{15} \]
\[ M[x, y, u, v] = \sum\sum (1 + u)^{x_i} (1 + v)^{y_j}\, p(x_i, y_j) \tag{16} \]

If x and y are stochastically independent so that p(x, y) = p(x)p(y),
then
\[ G(x, y, u, v) = G(x, u)\,G(y, v) \ \text{and two similar results} \tag{17} \]
If v = u in (14), (15), (16) we obtain the generating functions for
the distribution of z = x + y, namely,
\[ G(z, u) = \sum\sum u^{x_i + y_j}\, p(x_i, y_j) \ \text{and two similar results} \tag{18} \]


The idea of a generating function links up with the theory of Laplace 
Transforms, but there is no need to trace the connection here. 
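
Results (17) and (18) together say that, for independent x and y taking the values 0, 1, 2, ..., the p.g.f. of z = x + y is the product of the two separate p.g.f.s. The sketch below illustrates this by multiplying the two polynomials coefficient by coefficient. The two distributions are hypothetical and the function names are ours.

```python
from fractions import Fraction


def pgf_product(p, q):
    """Coefficients of G(x, u) G(y, u): the law of x + y for independent x and y."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j  # u^i times u^j contributes to u^(i + j)
    return out


def pgf_value(coeffs, u):
    """Evaluate a p.g.f. with coefficients p(0), p(1), ... at the point u."""
    return sum(c * u ** n for n, c in enumerate(coeffs))


# Hypothetical laws: x takes the values 0, 1 and y the values 0, 1, 2.
p_x = [Fraction(1, 2), Fraction(1, 2)]
p_y = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

p_z = pgf_product(p_x, p_y)
print(p_z)
```

The coefficient of u^n in the product is the probability that x + y = n, which is the sense in which the function “generates” the distribution of the sum.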
(c) The continuous statistical variable. 

If x is a continuous statistical variable so that the probability that
x lies in value between x and x + dx is p(x) dx, then
p(x) is termed the probability density of the distribution of x.

The m.g.f. of x is


\[ M(x, u) = \int e^{ux} p(x)\, dx \tag{11a} \]


and the f.m.g.f. is
\[ M[x, u] = \int (1 + u)^x p(x)\, dx \tag{12a} \]


but there is no corresponding probability generating function. The
theorems proved for m.g.f.’s and f.m.g.f.’s when x is discrete hold for
the case where x is continuous, with ∫ · · · dx replacing Σ.
QUERIES


George W. Snedecor, Editor


QUERY 91: Fisher (Design of Experiments, Section 50) shows how
to assess the effects of quality and the quality and quantity
interaction on the hypothesis of proportional response in the case
of 4 qualities of nitrogen at 3 equally spaced intervals, the lowest of
which is zero. How would this be done if the lowest level were not zero?


ANSWER: Suppose for any fertilizer we have

Quantity   1   2   3
Yield      a   b   c

with totals A, B, C for all four fertilizers together. Then I imagine it
will be agreed that the 2 d.f. denominated QUANTITY will have the
sum of squares


1 2 1 2 


The remainder for QUALITY and INTERACTION will then have 
— 1 4? + so) —1B°+ se) 
4 4 4 
which may, of course, be subdivided, as for example, 


\[ \tfrac{1}{2} S(c - a)^2 - \tfrac{1}{8}(C - A)^2 \]
\[ \tfrac{1}{6} S(a - 2b + c)^2 - \tfrac{1}{24}(A - 2B + C)^2 \]


The question is what apportionment of this total is most proper for 
the separation of QUALITY and INTERACTION. I think opinions 
may legitimately differ. Evidently, the 3 d.f. having 
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2 _ 2 


represent differences in quality as measured by linear response. A
second orthogonal set having


\[ \tfrac{1}{6} S(a - 2b + c)^2 - \tfrac{1}{24}(A - 2B + C)^2 \]

represent differences in quadratic response, which may or may not be
thought to be properly included in pure QUALITY. In any case, there
remain three more d.f. having


\[ \tfrac{1}{21} S(4a + b - 2c)^2 - \tfrac{1}{84}(4A + B - 2C)^2, \]

which seem to me to be properly described as INTERACTION or
RESIDUE.
All the algebra needed is that for the identity 


= (a + 2b + 6)" + (4a + b — 25%, 


but the whole process will look more convincing with numerical data. 
R. A. FISHER
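
An identity of the kind referred to in the answer above can be checked numerically. The sketch below assumes that the three quantities are 1, 2 and 3 units, so that a + 2b + 3c measures proportional response, and verifies that the sums of squares for the mean and the linear contrast c − a together equal those for a + 2b + 3c and 4a + b − 2c. The particular form 14(a + b + c)² + 21(c − a)² = 3(a + 2b + 3c)² + 2(4a + b − 2c)² is our own arrangement and is not asserted to be the form printed in the answer.

```python
from itertools import product


def lhs(a, b, c):
    """42 times the sums of squares for the mean and the linear contrast."""
    return 14 * (a + b + c) ** 2 + 21 * (c - a) ** 2


def rhs(a, b, c):
    """42 times the sums of squares for a + 2b + 3c and 4a + b - 2c."""
    return 3 * (a + 2 * b + 3 * c) ** 2 + 2 * (4 * a + b - 2 * c) ** 2


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Both sides are quadratic forms, so agreement on a grid of integers settles the identity.
identity_holds = all(lhs(a, b, c) == rhs(a, b, c)
                     for a, b, c in product(range(-3, 4), repeat=3))

proportional, quadratic, residue = (1, 2, 3), (1, -2, 1), (4, 1, -2)
mutually_orthogonal = (dot(proportional, quadratic) == 0
                       and dot(proportional, residue) == 0
                       and dot(quadratic, residue) == 0)
print(identity_holds, mutually_orthogonal)
```

The orthogonality check is what allows the three sets of 3 d.f. to be added without overlap.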
ABSTRACTS


161. OLIVER, JAMES T. (Chief, Statistical Section, Los Angeles
County Health Department). Mass Action Law Applied to
Tetanus Incidence.


Tetanus incidence would be classed by Sir Ronald Ross as an
“Independent Happening”, but such incidence seems to follow a law of
mass action which parallels that for the “Dependent Happenings” that
cover our communicable diseases.

The law of mass action for tetanus can be expressed as 

C = PEk (1) 
where C = number of cases 
P = Total number of individuals in the population 
E = Total number of exposures to cuts, scratches and other 
trauma to which the population is subject 
k = probability of infection 

Since E and k represent an average of the exposures and infection
probabilities of the various sub-groups in the population, (1) would
become


\[ C = P\bar{E}\bar{k} \tag{2} \]


Ross’ discussion and formulation of dependent happenings can be
redefined to account for the immunes in the population and to evolve a
formula for a case rate at time t which is similar to one developed for
dependent happenings.
NEWS AND NOTES


ALASKA—Samuel C. Litzenberger, Agronomist, Alaska Agricultural 
Experiment Station, Palmer, is a new member of this Society. He is 
currently interested in the use of statistical tools in planning and
evaluating his cereal crop research in genetics and plant breeding. . . . Edward
S. Weiss was commissioned in the U. S. Public Health Service and 
assigned to the Arctic Health Research Center, Anchorage, Alaska. He 
has been Chief of the Biometrics Branch for the past year. 


AUSTRALIA—R. Birtwistle, University of Adelaide, South Australia 
is now with Division of Building Research, Section of Mathematical 
Statistics, Victoria, Australia. ... G. S. Watson who during 1949 and 
1950 was a research officer of the Department of Applied Economics, 
University of Cambridge has recently been appointed as Senior Lecturer 
in the Department of Mathematical Statistics, University of Melbourne. 


BRAZIL—Agesilau A. Bitancourt, Director General, Instituto Biologico,
São Paulo, Brazil has joined our Society. He is interested in the
use of statistical tools for planning and evaluation in research on diseases 
of citrus and other plants and on the physiology of fungi. 


CANADA—J. H. Doughty, Director of Vital Statistics of the Pro- 
vincial Department of Health and Welfare, Victoria, B. C., Canada, 
joined The Biometric Society this year. He is interested in vital and 
public health statistics. ...G. F. M. Smith, Department of Biology, 
University of New Brunswick, Fredericton, Canada, in addition to being 
Professor of Zoology is working on a part-time basis with the Fisheries 
Research Board of Canada on statistical problems of fisheries research 
and design of experiments. 


ENGLAND—Katharine H. Coward has retired from the position of 
Head of the Nutrition Department in the College of the Pharmaceutical 
Society. ... D. J. Finney has been appointed Statistical Consultant to 
the National Foundation for Educational Research. He still holds the 
odd opinion that Biometrics should seek to excel as a scientific journal 
and need not rival the popular dailies by publishing trivial gossip. . . . 
P. M. Grundy is now on the staff of the Statistical Department, Rotham- 
sted Experimental Station. . . . A. Bradford Hill is now in his second 
year of office as President of the Royal Statistical Society. . . . R. M.
Jones is engaged in statistical work for the Medical Research Council’s 
Climate and Working Efficiency Unit. . . . Doris Lee is visiting the Insti- 
tute of Statistics during 1952. She is from the Institute of Education, 
University of London and has come over to the U. S. to observe how the 
statistical tools are used in practical biological research. . . . M. R. 
Sampford has received the degree of D.Phil. at Oxford for his work on 
“Truncated probability distributions and time-response curves’. .. . 
- J. Taylor is now Statistician to the Department of Agriculture and For- 
ests, Sudan. .. . M. C. K. Tweedie has been awarded a Leverhulme 
Fellowship for research in Mathematical Statistics. ...D.R. Westgarth 
is Statistician to the Rubber Research Institute of Malaya. 


INDIA—Raghu Raj Bahadur has been teaching statistics at the 
University of Chicago since completing his Ph.D. in Mathematical 
Statistics at the University of North Carolina in 1949... . Ishu Bangdi- 
wala received his M.S. in Experimental Statistics from North Carolina 
State College in 1950 and has since been working towards his Ph.D... . 
Uttam Chand has been working as Assistant Professor of Mathematical 
Statistics at Boston University since completing his Ph.D. in Mathe- 
matical Statistics from the University of North Carolina in 1949... . 
Sudhir Ghurye is working towards his Ph.D. in Mathematical Statistics 
at the University of North Carolina. . . . Gopinath Kalyanpur received 
his Ph.D. from the University of North Carolina in Mathematical 
Statistics in 1951. He is leaving for the University of California to 
teach Mathematical Statistics for about a year. . . . Subash Mazumdar 
received his M.S. in Experimental Statistics from North Carolina State 
College in 1950 and is at present with the Social Affairs Department of 
the United Nations Organization. .. . D. N. Nanda received his Ph.D. 
in Mathematical Statistics at the University of North Carolina in 1948. 
Returned to India to work as Statistician in the Government Ordnance 
Factory at Kanpur. He is doing applied work on sampling inspection, 
designing of specifications, sample surveys for estimation of losses in 
Army depots, and large scale users trials for testing garments. . . . 
R. D. Narain leaves the Indian Council of Agricultural Research where 
he was an Assistant Professor to join as Professor of Statistics and 
Mathematics at Dharwar in the newly founded Karnatak University. . . . 
W. R. Natu, Statistical and Economic Advisor to the Government of 
India, Ministry of Food and Agriculture, has joined the International 
Monetary Fund as an Executive Director. . . . V. G. Panse joins the 
Indian Council of Agricultural Research, New Delhi, India, as Statistical 
Adviser, leaving the Directorship of the Institute of Plant Industry at 
Indore, where his contributions to quantitative genetics are well known.
His association with sample survey work dates from 1942, when he first
initiated a large scale survey for the estimation of yield of cotton in
Madhya Pradesh. . . . G. N. Sankpal, Director of the Bureau of Economics
and Statistics, Government of Bombay, has left for Canada and
U.S.A. on a U.N. Fellowship for studying statistical work and organizations
in those countries. . . . A. R. Sen is working towards his Ph.D. in
Experimental Statistics at North Carolina State College. . . . Sharad- 
Chandra Shankar Shrikhande received his Ph.D. in Mathematical 
Statistics at the University of North Carolina in 1950 and is now working 
as Professor of Mathematics at the University of Nagpur, India. . . . 
P. V. Sukhatme, for a long time Statistical Advisor Indian Council of 
Agricultural Research, New Delhi, India, has joined the FAO at Rome 
recently as Chief of the Statistics Branch. Members of the Society will 
be glad to know that he was recently elected an Honorary Fellow of the 
American Statistical Association for his pioneering work on sample 
surveys in India. . . . Shantilal Amidas Vora returned to India last Janu- 
ary having taught Statistics at Stanford University the previous year. 
Mr. Vora received his Ph.D. in Mathematical Statistics at the University 
of North Carolina in 1950. . . . The Indian Society of Agricultural Sta- 
tistics conducted a five-week summer session on sample surveys in May- 
June, 1951. The topics covered were: (i) official methods of maintaining 
agricultural statistics relating to acreage and production and attempts at 
improvement through sample surveys, (ii) relative merits of the use of 
the sample surveys for area and yield estimation against the background 
of existing conditions, (iii) advanced treatment of sampling techniques, 
researches in India and abroad, (iv) current sampling enquiries and scope 
for further work. A knowledge of elementary statistical methods was a 
pre-requisite. The lectures on aspects (i) and (ii) above were given by 
V. G. Panse and on (iii) and (iv) by P. V. Sukhatme. The fourth annual 
conference of the Indian Society of Agricultural Statistics is proposed to 
be held in December just before the session of the International Institute 
of Statistics at New Delhi, India. 


ISRAEL—Louis Guttman, Scientific Director of Israel Institute of 
Applied Social Research and formerly Professor of Sociology at Cornell,
visited the Cornell campus in March. Mr. Guttman has been in Israel 
for over 3 years, during which time he has greatly expanded the theory 
of scale analysis. 


ITALY—The first Italian meeting of The Biometric Society was held 
in Milan on March 14, 1951. The morning session was opened by G. 
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Barbensi, a physician who has dedicated the last twenty years to the 
teaching and application of biometrical methods, and is the author of the 
first book published in Italian on the ‘Elements of biometrical metho- 
dology’”’. M. Boldrini, professor of statistics at Milan Catholic Univer- 
sity was in the chair. Mr. Barbensi sketched a short history of early 
biometrical work in this country and discussed what he considered to be 
the main bars to the development of biometrical work and thought in 
Italy. He proposed that a vote to the Ministry of Education be ex- 
_pressed by the assembly asking for the establishment of regular teachings 
. of Biometry in the Scientific Faculties. Such a teaching usually exists 
but mostly only in theory. The proposal was later discussed and a draft 
signed by attending members. V. Tonolli described some applications of 
biometrical methods to quantitative investigations of plankton ecology 
and genetics, carried out in the past years at Instituto Italiano di 
Idrobiologia, Pallanza, Lake Maggiore. R. Scossiroli, formerly assistant 
at the Experimental Station for Maize Breeding, Bergamo, and now 
assistant at the Institute of Genetics of Pavia University, gave a sum- 
mary of methods and results obtained in testing hybrid maize yields, 
and in the investigation of some biological problems of maize. It may 
be recalled that hybrid maize was imported to Italy only after the war 
and was the first strong stimulus to the application of modern designs to 
agricultural field experiment. E. Baldacci, Professor of Plant Pathology 
at Milan University, and M. Orsenigo described results obtained in 
applying statistical tests to the taxonomy and systematics of the genus 
Endothia (Ascomycetes). Standard differential characters were shown 
to be often unreliable, and some of the claimed specific differences ruled 
out by statistical analysis. . . . The afternoon session was held after a 
lunch offered by Instituto Sieroterapico Milanese, and a demonstration 
of the IBM statistical machines at the same Institute. Mr. A. Buzzati- 
Traverso held the chair. Mr. A. De Barbieri, Biochemist, Director of 
research at Instituto Sieroterapico Milanese, reported on investigations 
on the relationship between blood sugar level after insulin injection and 
frequency of convulsions in the rabbits. The relation is exponential, 
and its parameters vary with insulins of different degree of purity. 
Mr. V. Gallo, Hematologist, Assistant at University Hospital, Pavia, 
described a method designed to estimate the average life of basophilic 
erythroblasts. G. Di Vita of Psychiatric Hospitals, Genova Quarto, 
illustrated data collected to test the alleged influence of the lunar cycle 
on epileptic fits. The data showed significant periodicities which are 
being submitted to harmonic analysis. L. L. Cavalli dealt with applica- 
tions of maximum likelihood to the problems of estimating bacterial 
numbers and disinfection curves, on the basis of dilution and plate 
counting methods. . . . Attendance at the meeting was relatively large; 
of the new members who joined this year and were at the meeting, 
Mr. G. Pettenella is a Bacteriologist, working on vaccines production 
and control at Instituto Sieroterapico Milanese; Mr. M. Bracco is a 
Biochemist, Director of the laboratory of Villaggio Sanatoriale, Sondalo, 
a huge sanatorial hospital at 5000 ft. with more than 2500 beds, where a 
statistical department is being planned. Mr. A. Cresseri is a Human 
Geneticist at the Medical Faculty of the University of Milan; Mr. G. 
Magni, Assistant, Genetics Institute of Pavia University is working on 
yeast genetics and on effects of radiations on microorganisms. . . . Visits 
of foreign members have taken place. A visit by Miss M. E. Bernstein 
to Rome University was announced; Mr. Frederick Straus reports a move 
from Washington, D. C. to Rome. . . . A note should be made concerning 
the present position of biometry in this country. The teaching of sta- 
tistics could be expanded and its standard improved, only if an adequate 
number of sufficiently trained teachers were available; before then, 
pressure on the educational authorities might be of little utility. Short 
concentrated courses (to take place during vacations) are being planned, 
and information is being sought on prospective attendance. Two condi- 
tions should be met, however, before such or similar courses can come 
into existence; the availability of adequate funds and that of Italian- 
(or French-) speaking visiting lecturers. Suggestions and offers on these 
two matters would be welcome. . . . The next general meeting of the 
Biometrical Society is to take place in Italy in 1953. The IX Congress 
of Genetics will take place some time about the end of August, 1953, in 
Bellagio, North Italy. . . . Renzo Scossiroli, Stazione Sperimentale 
Maiscoltura, Bergamo, Italy has joined the staff of Instituto di Genetica 
della Universita, Pavia, Italy. 


JAPAN—D. George Deihl, formerly with the National Cancer 
Institute of the U. S. Public Health Service, is now with the Economic 
and Scientific Section, General Headquarters, Supreme Commander for 
the Allied Powers, Tokyo, Japan, as a Survey Statistician. . . . Joseph 
C. Dodson accepted a position with the Department of State last May as 
Attache (Agricultural) in the U. S. Embassy, Tokyo, Japan. Since 1948 
Mr. Dodson had been an Agricultural Statistician in the Natural Re- 
sources Section, General Headquarters, Supreme Commander for the 
Allied Powers, Tokyo, and before that was with the Department of 
Statistics of Iowa State College. In a recent letter Mr. Dodson writes, 
“After the occupation ends I would like to see the Japanese assisted as 
far as possible in catching up on biological statistics. Perhaps an ex- 
change of professors would be one way. While with SCAP last winter I 
tried to arrange for Tokyo University to invite an American Statistician 
(like, say, Snedecor, Cox, or Cochran) to lecture here during 1951-52. 
The plan was supported up through President Nambara of the Uni- 
versity, but fell through when the Ministry of Education could not pro- 
vide the necessary funds. Dr. Deming has given some lectures in mathe- 
matical statistics at the University. What is most needed now, I think, 
is some constructive work on the applied side. I wonder if you have any 
views along this line.” . . . Miss Fumi Miyamoto, Assistant Professor 
of Mathematics, Nara Women’s University (Nara City, Japan), is 
studying Mathematical Statistics at the University of North Carolina 
in Chapel Hill under the sponsorship of American Association of Uni- 
versity Women. . . . Sigeiti Moriguti, Assistant Professor of Applied 
Mathematics at the University of Tokyo, is spending the academic year 
1950-51 in research and study of Mathematical Statistics at the Uni- 
versity of North Carolina under the sponsorship of the United States 
Army. He is the author of numerous research articles and a book on the 
theory of statistics. 


NETHERLANDS—M. G. Neurdenburg has left the statistical divi- 
sion of the Municipal Medical and Public Health Department of 
Amsterdam. He has recently been appointed medical officer of public 
health in the Netherlands State Public Health Department. His duties 
will be: (a) to start the Cancer-Registration in the Netherlands, and 
(b) to act as Head of the newly created Division of Public Health 
Statistics in the same Department. As both jobs have to be built up from 
the bottom, he kindly requests that his name be put on your mailing-list. 
All material and information that you might think useful to him, will be 
welcomed most heartily. As editor-in-chief of the Netherlands fort- 
nightly of social medicine (Tijdschrift voor Sociale Geneeskunde) he will 
also be grateful for all informational material on this subject and on its 
activities in your country. 


PUERTO RICO—G. C. Kyker, Head of Department of Biochemistry 
and Nutrition, upon request has sent us some information on The School 
of Tropical Medicine. He writes, “It is an institution which, during 
approximately a quarter of a century, has earned for itself wide acclaim 
in many countries. Its laboratories and other facilities occupy a very 
large and attractive building in San Juan located next to the Capitol. 
The setting is unique in that facing north one looks out over the surf of 
the Atlantic, which is only a few feet from the entrance to the building; 
and by turning south one views the harbor of San Juan. Prior to 1950 
the school devoted itself largely to medical research as it pertains to 
Puerto Rico in particular and to the tropics in general. Excellent 
laboratory, library, and hospital facilities were available and the various 
investigations centered about many important problems in tropical 
diseases and nutrition. A prominent educational aspect of the school 
was the training of medical technologists. Certain changes have come 
with the advent of a new medical school which began the teaching of its 
first class in August 1950. The school now carries the double title: 
School of Medicine—School of Tropical Medicine and is a part of the 
University of Puerto Rico. The major objectives of the former adminis- 
tration are retained and in addition a class of fifty medical students is 
entering during August of each year. The new function of medical 
education was organized and set into operation within a very short space 
of time. The official opening in August 1950 followed the arrival of 
many of the teaching and administrative officers of the school and its 
basic science departments by only a few months. This short time was 
possible because of the resources of the physical plant of the School of 
Tropical Medicine. Likewise, the former objectives of the School of 
Tropical Medicine are being enlarged through the additional 
investigative interests of personnel who have come to work in the School of 
Medicine. Donald S. Martin, formerly of Duke University, is in charge 
of the School as Dean of the School of Medicine and Director of the 
School of Tropical Medicine. The Institution has a unique opportunity 
to serve both the immediate needs of Puerto Rico and a wider function as 
a crossroads to English speaking and Spanish speaking Americans. 
Changes in the physical plant of the School of Tropical Medicine which 
enabled the opening of the School of Medicine were limited. One wing 
was annexed to provide the basic science teaching laboratories. The 
hospital facilities which were accommodated in one wing of the older 
building were converted into a dormitory and cafeteria for students. 
In turn arrangements for ample teaching hospital facilities are being 
negotiated with the larger hospitals in San Juan. Just recently, when 
the first class entered its second year of medicine, the faculty 
for the third and fourth year of medicine was under final consideration 
and heads of departments of the third and fourth year faculty were 
selected. Throughout the history of the School of Tropical Medicine, 
Columbia University has given certain important assistance. In recent 
years, the school has been an insular institution under the administration 
of the University. Columbia continues to offer its interest. Harold W. 
Brown, Director of the School of Public Health at Columbia recruited 
personnel for the School of Medicine and he continues as Adviser to the 
Chancellor in charge of Medical Affairs. Another tangible expression of 
the interest which Columbia shows is that John Fertig, Professor of 
Statistics at Columbia, offers each year a course in statistics for the 
students here. During the current year Mr. Fertig is offering two 
courses in statistics. One course is designed for students in medicine 
and medical technology. This course stresses primarily the treatment 
and evaluation of experimental data and also the design of experiments. 


SWEDEN—Halvdan Astrand formerly Statistician with the Swedish 
Sugar Company has served the last year as Head of the new Institute for 
Agricultural Investigations. This Institute was established by The 
Federation of Swedish Farmers’ Associations and the Swedish Farmers’ 
Union. They collect and analyze statistical materials concerning farm 
production and turnover, calculate production costs and index figures, 
study questions of food policy, standard of living and other related 
problems. . . . At the Royal Agricultural College and National 
Agricultural Research Center (Kungl Lantbrukshögskolan och Statens 
Lantbruksförsök), Uppsala, an Institute of Statistics and Experimental 
Design was established on April 1, 1950. It is headed by I. Bachér, 
author of the parts on statistics of a Swedish handbook on design, 
technique and analysis of field experiments (Handledning i försöksteknik, 
1939, now out of print). Martin D. Sandelius, who held a lectureship 
1949-50 at the Department of Mathematics, University of Washington, 
Seattle, later joined the staff of the Institute as an Assistant. The ac- 
tivities of the Institute are divided between teaching, consultation and 
research. . . . Nils Blomqvist recently returned to Stockholm from a 
two-year stay in the U. S. During his first year there, he studied 
mathematical statistics at Columbia University and the second year he served 
as Instructor in Mathematics and Mathematical Statistics at Boston 
University. Mr. Blomqvist is now in charge of the consultation activities 
of Statistiska Forskningsgruppen (an association of mathematical 
statisticians engaged in research and consultation). . . . Under the 
sponsorship of the Department of Mathematical Statistics at Stockholm 
University, seminars on the application of modern statistical methods 
to problems in biology and medical science were given last Spring. The 
lectures were delivered by Harald Cramér, Head of the Department, 
Gunnar Blom, Bertil Matérn and Leonard Goldberg. . . . Tore Dalenius, 
Stockholm, who studied the theory and methods of sample surveys in 
the U. S. 1947-48, during which time he served as Graduate Research 
Assistant at Cornell University, returned to the U. S. in January 1951, 
for a five months visit. During this latest visit, under the sponsorship 
of ECA, Mr. Dalenius studied wage statistics and related topics at 
Bureau of Labor Statistics and spent a period at Bureau of the Census, 
studying the use made of modern statistical methods in surveys carried 
out by that agency. He is working now as Statistician with the Social 
Board. . . . Ulf Grenander, Stockholm, has been invited to spend one year 
at the University of Chicago, where he will participate in the research 
program on Statistical Inference. . . . The staff of the Statistiska Insti- 
tutionen of the University of Uppsala includes H. Wold, S. Malmquist, 
P. Whittle and K. Medin. Mr. Wold spent the first six weeks of 1951 
at the London School of Economics on invitation under their “Scheme 
of Nordic Studies”. He also gave seminars at the Galton Laboratory, 
in Oxford, Cambridge and Manchester. Mr. Malmquist spent part of 
the year in the U. S. 


SWITZERLAND—Ernst Horber, Entomologist at the Swiss Experimental 
Station for Agriculture, Zurich-Oerlikon, Switzerland, has joined 
our Society. He is interested in the application of statistical tools to 
research in applied entomology of field crops, as for instance Cockchafer 
larvae, wireworms, Colorado beetle. He writes, “The lectures of our 
A. Linder have opened me a new way to sampling for estimating insect 
populations and to judging efficiency of the different new insecticides and 
methods of Pest Control.” . . . M. Pascua, Director of the Division of 
Health Statistics, directs and supervises the labors of the three Sections 
of the Division: Morbidity Statistics, Statistical Studies, and 
International Statistical Classification of Causes of Death. Inherent to the 
position is the duty of making analyses and studies on mortality, several 
of which have already appeared in the “Epidemiological and Vital Statistics 
Report” of the World Health Organization; also the secretaryship of the 
Expert Committees of the Organization dealing with Health Statistics. 


UNITED STATES—John C. Bain, Chief Statistician of the Abitibi 
Power and Paper Company, Toronto, Canada, has recently joined The 
Biometric Society. The Department of Statistical Research, which he 
heads, is doing work in a variety of fields including statistics such as 
experimental design in forest operations, manufacturing processes and 
chemical research; quality control; employee testing, besides 
econometrics (linear programming), actuarial science and machine 
computation. . . . Herbert C. Batson reported May 1 as Professor, Department of 
Public Health, Illinois College of Medicine in Chicago and also as Re- 
search Associate, Division of Laboratories, Illinois Department of 
Public Health. He is giving beginning courses in Biometrics for graduate 
students in the medical sciences. Mr. Batson left the Biologic Products 
Division, Army Medical Service Graduate School, Army Medical Center, 
Washington, to take this position in Chicago. As soon as he learns to 
fold a newspaper properly he will be able to pass as a confirmed 
commuter. . . . Geoffrey Beall has been Professor of Statistics in the 
Department of Mathematics at the University of Connecticut since September 
1950. Besides a teaching program he is responsible for consultation on 
statistical problems with faculty and graduate students in the University, 
generally. He was previously Statistician in the Research Laboratories 
of Swift and Company. . . . Charles A. Bicking is advising the various 
branches and field installations of the Ordnance Research and 
Development Division in the design of experiments. He is located with Leslie 
E. Simon, Chief of the Division, in the Office of the Chief of Ordnance, 
Washington, D.C. Mr. Bicking came to this position from the Hercules 
Powder Company where he was Quality Control Engineer. . . . Richard 
H. Blythe, Jr. is with the Operations Analysis office at Headquarters Air 
Defense Command, Colorado Springs. . . . Kenneth A. Brownlee of 
E. R. Squibb and Sons is now with Technical Operations Service, 
Dugway Proving Ground, Tooele, Utah. . . . Martin A. Brumbaugh, 
Director of Statistics at Bristol Laboratories, Inc., Syracuse, New York 
received the Shewhart Medal for outstanding service and leadership 
in the field of quality control. . . . Paul T. Bruyere and Martha C. Bruyere 
were appointed by Arthur Linder, President, to act as observers for The 
Biometric Society at the meeting of the World Health Organization 
Regional Committee for the Americas, Third Session, held in 
Washington, August 20-27. . . . Joseph M. Cameron has been appointed assistant 
chief of the Statistical Engineering Laboratory of the National Bureau 
of Standards. He succeeds W. J. Youden, who is thus freed of 
administrative duties to devote full time to his work as general consultant to the 
Bureau on experiment design and the analysis of experimental data. . . . 
Kai Lai Chung, who spent last year at Columbia University in the 
Department of Mathematical Statistics is presently employed by the 
Department of Mathematics, Cornell University, Ithaca, New York, 
and is working on an Air Force contract. . . . Edward L. Corton, Jr., 
Naval Ordnance Test Station, China Lake, California has been ap- 
pointed Meteorologist in the field of Marine Climatology at the 
Navy Hydrographic Office, Washington, D. C. . . . Bliss H. Crandall, 
Director, Statistical Laboratory, Utah State Agricultural College, Logan 
teaches and spends much time in consultation with project leaders. 
Together they review objectives, facilities available and precision desired 
for a particular problem and select those treatments for trial which seem 
best to meet the objectives. They have a system worked out which 
enables tabulating field plans by machine. He states, “Work cards are 
taken to the field and data marked with graphite pencils. The marks 
are later sensed electrically and holes punched directly into the cards 
which are then immediately available for summary. This system 
improves the accuracy of our research and keeps the summary of data 
up-to-date”. . . . James George Darroch departed from the Iowa State 
College Statistical Laboratory in January to go to Washington State 
College, Pullman as Experiment Station Statistician and Associate 
Professor of Agriculture. He has completed all course work and plans 
to complete his Ph.D. thesis in absentia. Others of the same status are 
Virgil Anderson, who has gone to Purdue University, Indiana as Assist- 
ant Professor with the Agricultural Experiment Station and the Statistics 
Section of the Department of Mathematics; Daniel Horvitz to the 
University of Pittsburgh as Assistant Professor in the Department of 
Biostatistics and the Mellon Research Institute; and John Hofmann to 
Dugway Proving Grounds, Technical Operations, Tooele, Utah. . . . 
James W. Degan, University of California, is now with the Research 
Laboratory of Electronics, Massachusetts Institute of Technology, 
Cambridge. . . . Arthur M. Dutton joined the staff of the University of 
Rochester, after receiving his Ph.D. in statistics at Iowa State College 
last June. . . . Evelyn Fix, Instructor and Research Associate, Statistical 
Laboratory, University of California, Berkeley, was promoted to 
Assistant Professor and Research Associate effective July 1, 1951. . . . 
Joseph A. Greenwood has been with the Bureau of Aeronautics, Navy 
Department for nine years. He is Chief Consultant for Engineering 
Statistics and as such is responsible for the introduction of statistical 
design and analysis into projects carried out by the bureau field activities, 
and for adequate sampling provisions in specifications. Mr. Greenwood 
was previously with Duke University as Assistant Professor of 
Mathematics and Statistical Consultant to the Parapsychology Laboratory. . . . 
Richard J. Henry, M.D. with the Bio-Science Laboratories, Inc., Los 
Angeles is a new member of our Society. He is interested in the 
applications of statistical tools in his research in bacteriology, bio-chemistry 
and medicine. His wife, Maryon D. Henry, is also a member. . . . Henry 
Higley, Los Angeles Chiro College, joined The Biometric Society in 
1951. He is interested in the use of statistical methodology for the 
planning and evaluation of research in biophysics. At present, he is 
working on a study of the Iontophoresis of Antibiotics. These data are being 
evaluated by the aid of Analysis of Variance. . . . J. L. Hodges, Jr., 
Assistant Professor and Research Associate, Statistical Laboratory, 
University of California, Berkeley, will be on leave of absence in a 
visiting capacity at the University of Chicago during the academic year 
1951-52. . . . A course in the techniques of preparing problems for high 
speed digital computing machinery will be given at Oak Ridge from 
December 3-14, 1951, by the Special Training Division of the Oak Ridge 
Institute of Nuclear Studies in cooperation with the Oak Ridge National 
Laboratory. The course will be centered around the computing machine 
to be installed at Oak Ridge National Laboratory, although 
modifications of machines and preparation techniques will be taken up. The 
Oak Ridge National Laboratory machine will be an electronic automatic 
computer, single address type, patterned after the Institute for Ad- 
vanced Study Computer. Alston S. Householder, Chief of the Mathe- 
matics Panel, Oak Ridge National Laboratory, will be the course 
director. Assisting him will be staff members of Argonne National 
Laboratory, the Los Alamos Scientific Laboratory, the Computer 
Branch of the Office of Naval Research, and the Institute for Advanced 
Study. The course will be limited to 30 participants. Application 
blanks and additional information may be obtained from Ralph T. 
Overman, Chairman, Special Training Division, Oak Ridge Institute of 
Nuclear Studies, Oak Ridge, Tennessee. . . . Henry R. Kathrein, Bac- 
teriologist with The Samuel Roberts Noble Foundation, Ardmore, 
Oklahoma, has become a new member of our Society. The scientists at 
the Foundation do research in such fields as plant nutrition, nutrition of 
microorganisms, and metabolism of proteins and carbohydrates and have 
found statistical methods useful. S. M. Free was with the Noble Foundation 
before taking up his study of statistics at the Institute of Statistics. . . . 
O. Kempthorne, Statistical Laboratory, Ames, was promoted to Pro- 
fessor of Statistics July 1. His book “The Design and Analysis of 
Experiments” which will contain approximately 600 pages is in press and 
will be issued by John Wiley and Sons. . . . Jack Kiefer, who did his work 
in statistics and mathematics at Columbia University, is now a member 
of the Department of Mathematics, Cornell University, Ithaca, New 
York. . . . Erich L. Lehmann, Assistant Professor and Research Associate, 
Statistical Laboratory, University of California, Berkeley has been 
promoted to Associate Professor and Research Associate effective July 
1, 1951. During the academic year 1951-52, Mr. Lehmann will be in a 
visiting capacity at Stanford University, on leave from the University of 
California. . . . Eugene Lukacs is with the Statistical Engineering 
Laboratory, National Bureau of Standards. He is engaged in research 
particularly on autoregressive series and stochastic processes. Mr. 
Lukacs went to the Bureau from the U. S. Naval Ordnance Test Station 
at Inyokern, California, where he was Head of the Statistics Branch. 
The position as Statistician for the U. S. Naval Ordnance Test Station 
is being held now by Paul Peach, who was a member of the staff of the 
Institute of Statistics, Raleigh for several years. . . . Garnet MacCreary, 
formerly of the University of Winnipeg, Manitoba, Canada, has been 
hired as statistical consultant for a project on the study of the incidence 
of mental disorders. This project is directed by Alexander Leighton, 
Cornell University. . . . S. E. A. McCallan, Plant Pathologist, with 
Boyce Thompson Institute, Yonkers, New York, has joined The 
Biometric Society. He is interested in the use of statistical tools for 
planning and evaluating research on fungicides. . . . Charles M. Mottley 
is Chief of the Combat Operations Team in the Operations Analysis 
Division, of U. S. Air Force Headquarters. Walter L. Deemer, Jr. is 
on his team, a Colonel among twenty civilian scientists. . . . Sidney I. 
Neuwirth, Statistician, with Schering Corporation, Bloomfield, New 
Jersey, has joined The Biometric Society. He is interested in the theory 
and applications of statistical methods to bioassays. . . . Pauline Paul is 
with the Foods and Nutrition Department, Michigan State College, 
East Lansing. She is teaching and doing research in food preparation 
problems where she finds statistical methodology useful, especially in 
the design and analysis of problems on tenderness of meat. One of the 
courses offered for graduate students in the department is a seminar on 
applications of statistics in foods and nutrition research problems. . . . 
James A. Rafferty is a member of the Atomic Warfare Team in the 
Operations Analysis Division, Headquarters, U.S. A. F. Previously he 
was with the School of Aviation Medicine, Randolph Air Force Base, 
Texas. . . . H. G. Romig has affiliated himself with the Hughes Aircraft 
Company, Culver City, California. He has accepted the position as 
Staff Engineer to coordinate the Quality Control work there. He came 
to this position from the Bell Telephone Laboratories, New York, with 
which organization he had been associated for the past twenty-five 
years. . . . Marion M. Sandomire, formerly with the Bureau of Census, 
is now located in the New York Operations Office of the Atomic 
Energy Commission. . . . Vincent Schultz, Project Leader of a State-wide 
Wildlife Survey of Tennessee, has joined our Society. He is interested 
in the application of statistical methods to wildlife research with par- 
ticular emphasis being placed on survey methods. . . . Elizabeth L. Scott, 
Instructor and Research Associate, Statistical Laboratory, University 
of California, Berkeley, was promoted to Assistant Professor and 
Research Associate effective July 1, 1951. . . . P. V. Sukhatme, Chief of 
the Statistics Branch, Food and Agriculture Organization of the United 
Nations, will be Visiting Professor of Statistics at Iowa State College 
during the Spring Quarter, beginning March 27, 1952. Mr. Sukhatme 
will give lectures in advanced survey sampling. He was formerly 
Statistical Adviser to the Indian Council of Agricultural Research at 
New Delhi, India, and the lectures will be based on his research in 
sampling theory and applied sampling while he was employed by the 
Indian government. . . . P. C. Tang who was formerly Statistician with 
the Food and Agriculture Organization of the United Nations, is now 
Associate Professor of Statistics in the Statistical Laboratory, Iowa 
State College, Ames. . . . Gerhard Tintner, Iowa State College, Ames, 
was selected to be an Associate Editor of Econometrica, as well as 
Editor of the book review section of that publication. . . . Jacob Wolfo- 
witz, formerly of the Department of Mathematical Statistics at Columbia 
University, is now Professor of Mathematics in the Department of 
Mathematics at Cornell University, Ithaca, New York. In addition to 
courses in mathematical statistics, he will continue with his research 
studies and with work under Navy contract. 


1951 Summer Session in the Statistical Laboratory, University of 
California, Berkeley. In the first of two six-week summer sessions at the 
University of California, courses on an advanced and graduate level 
were offered, as well as the first course in mathematical statistics and 
probability. Maurice G. Kendall, Professor of Statistics at the London 
School of Economics and Political Science and J. Neyman, Director of 
the Berkeley Laboratory, directed the advanced and graduate courses 
and G. E. Noether of Boston University, assisted by Grace E. Bates, of 
Mount Holyoke College, conducted the first course and a laboratory 
course in applications. In the second session, Professor Neyman’s 
graduate course in individual research continued and the second course 
in statistics and probability, together with laboratory work in applica- 
tions, was given by C. R. Blyth, University of Illinois, Miss Bates 
assisting. The various research activities of the Statistical Laboratory 
were centered in the following three groups: The first group was con- 
cerned with a complex of questions arising from considerations of super- 
efficiency and identifiability. For example, (1) When looking for 
asymptotically normal estimates of parameters, under what conditions 
and how far must one go to improve on maximum-likelihood estimates? 
(2) Suppose the distribution of a set of observable random variables 
changes with a change of parameter (i.e., suppose the parameter is 
identifiable). Is it always possible to estimate this parameter consis- 
tently? Those most concerned with this topic were J. L. Hodges, Jr., 
and L. LeCam of the Laboratory, and Agnes Berger, New York City. 
The second group was interested in applications and methodology of 
medical follow-up procedures. This topic tied up with the problem of 
accident proneness in which two alternative models were considered: 
(1) The Polya contagious scheme in which each individual “begins” with 
the same proneness to accidents but accidents incurred by an individual 
and the influence of time-effect (effect of experience gained) contribute 
to his probability of incurring accidents in the future and (2) The 
Greenwood-Yule-Newbold mixture scheme in which there is no 
contagion, no time-effect, but the individuals differ from each other in regard 
to their proneness to accidents. Persons involved in this project were 
J. Neyman, Evelyn Fix of the Laboratory, Lila Elveback of the 
University of Minnesota, and Grace E. Bates. The third area of research 
was in the field of astronomy. J. Neyman and E. L. Scott of the Labora- 
tory have been working closely with the Lick Observatory and Otto 
Struve of the Astronomy Department, University of California, on this 
project. One of the most interesting problems considered was that of 
estimating the frequency and extent of extra-galactic absorbing clouds 
from the distribution of apparent magnitudes of stars. A similar 
problem was concerned with extra-galactic clusters of nebulae. These 
questions proved to be interesting in affording further applications of 
stochastic processes. In addition to the specific areas of research dis- 
cussed above, the Laboratory staff members have met many requests 
for consultation on diverse topics. 


Summer Session at the Institute of Statistics, Raleigh, N.C. Continu- 
ing a policy established in 1941 the Institute of Statistics conducted its 
fourth regular summer session designed for consultants, teachers, and 
students in statistics as well as for research scholars in other sciences. A 
grant from the General Education Board assisted the Institute in bring- 
ing to the campus G. W. Snedecor, founder and for 15 years Director of 
the Statistical Laboratory at Iowa State College, and W. J. Youden, 
Assistant Chief of the Statistical Engineering Section of the Bureau of 
Standards and for many years an outstanding consultant to research 
workers in the physical and biological sciences. In addition to these 
distinguished visitors the faculty presented the following regular members 
of the Institute staff: R. L. Anderson, R. C. Bose, Gertrude M. Cox, 
A. L. Finkner, R. J. Monroe, S. N. Roy, and H. Fairfield Smith. Over 
70 students were enrolled in the various courses offered. Many of these 
were research people sent by such industrial concerns as Esso Standard 
Oil, Monsanto Chemical Company, Joseph E. Seagram, Husky Oil 
Company, Gulf Research and Development, and National Lead. Others 
were teachers in various colleges and universities, consultants brushing 
up on latest statistical techniques, and students seeking course work 
towards advanced degrees. There were fewer foreign students than 
usual, due undoubtedly to world conditions, but the following countries 
were represented: Canada, Egypt, Puerto Rico, Brazil, Denmark, and 
India. In view of the continuing demand for such course work and the 
opportunity for informal association with people having many statistical 
interests in common the Institute plans to continue this supplementing 
of the regular academic program. 


The Statistical Laboratory, Iowa State College, Ames, participated in 
a Statistical Workshop for Home Economics Research Workers through- 
out the United States which was held on the Ames campus July 2-21 
and sponsored by the Research Training Committee of the American 
Home Economics Association. Bernard Ostle was statistician-in-charge. 


Virginia Polytechnic Institute Statistical Summer School. The Sta- 
tistical Summer Session held at the Virginia Polytechnic Institute from 
August 8 to August 25 drew graduate students from twelve states and 
three foreign countries. These students included army personnel, re- 
search personnel from industry, college professors, economists, agrono- 
mists, mathematicians, foresters, physicists, textile personnel, public 
health personnel, biologists, and sociologists. Total enrollment exceeded 
50. A series of special lectures was given during this session. Maurice 
Kendall, Professor at the University of London, lectured on the role of 
statistics in research; S. Lee Crump, University of Rochester, on variance 
component analysis with special reference to experimental and sampling 
designs; Walter A. Hendricks, Bureau of Agricultural Economics, on 
sampling; Allyn Kimball, Oak Ridge National Laboratory, on micro- 
biological assays with non-parallel response curves and on dependent 
tests of significance in the analysis of variance; Paul S. Olmstead, Bell 
Telephone Laboratories, on how to detect the type of an assignable 
cause; and Harrison D. Stalker, Washington University, St. Louis, on 
the application of statistics to the study of speciation. Bound copies of 
the special lectures may be obtained from the Statistical Laboratory at 
a nominal charge. 


Seminars in Connecticut. The second inter-university Summer 
Seminar in Statistics was held on the campus of the University of 
Connecticut at Storrs August 6-30, 1951. Each of the four weeks was 
devoted to a distinct field of statistics with sessions arranged by different 
teams of organizers. The program of the first week on applications to 
biology was arranged by C. I. Bliss and J. Ipsen. Three afternoon 
seminars featured papers by J. W. Tukey on “Simplified analysis of 
randomized blocks, Latin squares and other designs,” by J. Ipsen on 
“Biometric problems in epidemiology,” and by M. L. Greenwood on 
“A comparison of scoring methods for taste tests.” Morning clinics 
under the chairmanship of Geoffrey Beall and C. I. Bliss considered 
problems in animal nutrition, in general biometry and in the design and 
analysis of taste tests. The week closed with three round table discus- 
sions on effective official test procedures for the control of quality in the 
production of drugs. These considered statistical concepts in relation to 
pharmacopeial requirements with Lloyd C. Miller in the chair, sterility 
and pyrogen testing with W. E. Gaunt as chairman and biological stan- 
dardization under the chairmanship of R. H. Noel. Registered attend- 
ance during the first week numbered 74. 

In the remaining three weeks of the Seminar the programs concerned 
Time Series, arranged by M. G. Kendall and J. W. Tukey; Statistical 
Theory and Probability, arranged by M. Kac and H. Robbins; and Tech- 
niques of Interest in the Social Sciences, with F. Mosteller, F. L. 
Strodtbeck, and M. A. Woodbury as organizers. The seminars were 
characterized by easy informality and free discussion among the pro- 
fessional statisticians, students and consumers of statistical techniques 
who attended. (Information concerning the 1952 sessions is available 
from D. F. Votaw, Jr., Leet Oliver Memorial Hall, Yale University, 
New Haven, Connecticut.) 


Lattices, 145, 287 
LD-50, 295, 302, 339 
Least squares, 118, 233, 309 
Likelihood-ratio test, 19, 28, 125 
Limiting distribution, 26 
Linear dependence, 269 
Linear relation, 17 
Linked-block design, 124, 125 
Logarithmic distribution, 121 
  transformation, 113 
Logistic curve, 247, 327 
Logit, 327 
Mass action, 332, 435 
Mathematical biology, 134, 180, 228, 335, 391 
  model, 1, 115, 135, 185, 197, 253, 335, 355 
Maximum likelihood, 4, 136, 240, 268, 295, 328, 404 
Medical research, 309, 320, 435 
Method of moments, 142, 247, 404 
Minimum chi square, 338 
Moment generating function, 402 
Moving average, 302 
Multiple regression, 301 
Multivariate analysis, 37 
Mutation, 336 
Negative binomial, 356, 358, 403 
Non-normality, 6 
Non-parametric, 36 
Nomogram, 206 
Normal deviate test, 26 
Normal distribution, 337 
Nuisance parameter, 5, 19, 24 
Null hypothesis, 7, 17 
Orthogonality, 33, 47 
Over-dominance, 18 
Paired comparisons, 125, 312, 316 
Parasitology, 310 
Pharmacology, 171, 227 
Phylogeny, 121 
Physiology, 123, 185, 336 
Poisson distribution, 356, 395, 400 
Population growth, 332 
Power function, 6, 21, 24, 312 
Precision, 36, 59, 92, 102, 315 
Probability paper, 328 
Probits, 302, 327 
Psychology, 36, 114, 340, 376 
Quality control, 15 
Quantal response, 334 
Radiology, 126, 336 
Randomization, 47, 309 
Randomized blocks, 115, 299 
Random mating, 18 
Range, 222 
Rank analysis, 125 
Rat, 284 
Rectangular lattice, 145, 287 
Regression analysis, 33, 42 
  coefficient, 45, 51, 65, 195 
  equation, 187 
  non-linear, 210, 247, 324 
Reliability, 61 
Replacement variance, 56 
Reproductive rate, 155 
Sampling, 46, 83, 97, 104, 155, 300 
  double, 275 
Sampling distribution, 4, 5 
Sampling variance, 4, 137 


Scores, 268 
Selection of data, 47, 48, 299 
Separation of variance, 103 
Serial correlation, 315 
Serology, 323 

Significance, see test of 
Significant cut, 50 

Soils, 301 

Stochastic process, 121 
Structure, 36, 64 

Sufficient statistic, 338 
Survival time, 120 
Systematic designs, 167, 309, 319 
Test of significance, 20, 26, 56, 73, 114, 315 
  choice of, 51 
  distribution-free, 116 
Tetrads, 60 
Tetraspores, 118 
Transformations, 113 
Truncated distribution, 120 
t test, 299, 315 
Two-tailed test, 19, 222 
Unequal subclasses, 10 
Uniformity trial, 300, 310 
Variance, 17, and see analysis, components 
  homogeneity of, 19, 189 
Viability, 271 
Weights, 64, 291 
Wool, 83 
Yates’ correction, 222 
Youden squares, 118 
Zoology, 300 